Anna Tramarin


2022

pdf bib
NewYeS: A Corpus of New Year’s Speeches with a Comparative Analysis
Anna Tramarin | Carlo Strapparava
Proceedings of the LREC 2022 workshop on Natural Language Processing for Political Sciences

This paper introduces the NewYeS corpus, which contains the Christmas messages and New Year’s speeches held at the end of the year by the heads of state of different European countries (namely Denmark, France, Italy, Norway, Spain and the United Kingdom). The corpus was collected via web scraping of the speech transcripts available online. A comparative analysis was conducted to examine some of the cultural differences showing through the texts, namely a frequency distribution analysis of the term “God” and the identification of the three most frequent content words per year, with a focus on years in which significant historical events happened. An analysis of positive and negative emotion scores, examined along with the frequency of religious references, was carried out for those countries whose languages are supported by LIWC, a tool for sentiment analysis. The corpus is available for further analyses, both comparative (across countries) and diachronic (over the years).