A Thesaurus-based Sentiment Lexicon for Danish: The Danish Sentiment Lexicon

Sanni Nimb, Sussi Olsen, Bolette Pedersen, Thomas Troelsgård


Abstract
This paper describes how a newly published Danish sentiment lexicon with high lexical coverage was compiled using lexicographic methods, based on the links between groups of words listed in semantic order in a thesaurus and the corresponding word sense descriptions in a comprehensive monolingual dictionary. The overall idea was to identify negative and positive sections in a thesaurus, extract the words from these sections, and combine them with the dictionary information via the links. The annotation of the dataset involved several steps and was based on the comparison of synonyms and near-synonyms within a semantic field. In cases where one of the words was included in the smaller Danish sentiment lexicon AFINN, its value there was used as inspiration and extended to the synonyms when appropriate. In order to obtain a more practical lexicon with overall polarity values at lemma level, all the senses of each lemma were afterwards compared, taking into consideration dictionary information such as usage, style and frequency. The final lexicon contains 13,859 Danish polarity lemmas and includes morphological information. It is freely available at https://github.com/dsldk/danish-sentiment-lexicon (licence CC-BY-SA 4.0 International).
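The two-stage idea in the abstract (sense-level values within a thesaurus section, then one value per lemma) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy sections, the seed values standing in for AFINN scores, the function names, and the use of a plain mean. The abstract describes a manual lexicographic annotation process, so this is not the authors' procedure, only a rough picture of the data flow.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy data: thesaurus section -> (lemma, sense number) pairs.
sections = {
    "joy": [("glad", 1), ("lykkelig", 1)],
    "anger": [("vred", 1), ("sur", 2)],
    "taste": [("sur", 1)],
}
# Hypothetical seed polarities, standing in for values from a smaller lexicon.
seeds = {"glad": 3, "vred": -3}


def propose_sense_polarities(sections, seeds):
    """Propose a value for every sense in a section from the section's seed words."""
    proposals = {}
    for members in sections.values():
        values = [seeds[lemma] for lemma, _ in members if lemma in seeds]
        if not values:
            continue  # no seed word here: left for manual annotation
        section_value = mean(values)
        for lemma, sense in members:
            proposals[(lemma, sense)] = section_value
    return proposals


def lemma_polarities(proposals):
    """Collapse sense-level proposals to one value per lemma (plain mean, an assumption)."""
    by_lemma = defaultdict(list)
    for (lemma, _sense), value in proposals.items():
        by_lemma[lemma].append(value)
    return {lemma: mean(values) for lemma, values in by_lemma.items()}


sense_values = propose_sense_polarities(sections, seeds)
lemma_values = lemma_polarities(sense_values)
```

In this toy run only the "grumpy" sense of "sur" receives a proposal, because the "taste" section contains no seed word; in the described workflow such cases would be decided by an annotator using dictionary information such as usage, style and frequency.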
Anthology ID:
2022.lrec-1.302
Volume:
Proceedings of the Thirteenth Language Resources and Evaluation Conference
Month:
June
Year:
2022
Address:
Marseille, France
Editors:
Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
2826–2832
URL:
https://aclanthology.org/2022.lrec-1.302
Cite (ACL):
Sanni Nimb, Sussi Olsen, Bolette Pedersen, and Thomas Troelsgård. 2022. A Thesaurus-based Sentiment Lexicon for Danish: The Danish Sentiment Lexicon. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 2826–2832, Marseille, France. European Language Resources Association.
Cite (Informal):
A Thesaurus-based Sentiment Lexicon for Danish: The Danish Sentiment Lexicon (Nimb et al., LREC 2022)
PDF:
https://aclanthology.org/2022.lrec-1.302.pdf