Debunking Sentiment Lexicons: A Case of Domain-Specific Sentiment Classification for Croatian

Paula Gombar, Zoran Medić, Domagoj Alagić, Jan Šnajder


Abstract
Sentiment lexicons are widely used as an intuitive and inexpensive way of tackling sentiment classification, often within a simple lexicon word-counting approach or as part of a supervised model. However, it is an open question whether these approaches can compete with supervised models that use only word-representation features. We address this question in the context of domain-specific sentiment classification for Croatian. We experiment with the graph-based acquisition of sentiment lexicons, analyze their quality, and investigate how effectively they can be used in sentiment classification. Our results indicate that, even with as few as 500 labeled instances, a supervised model substantially outperforms a word-counting model. We also observe that adding lexicon-based features does not significantly improve supervised sentiment classification.
Anthology ID:
W17-1409
Volume:
Proceedings of the 6th Workshop on Balto-Slavic Natural Language Processing
Month:
April
Year:
2017
Address:
Valencia, Spain
Editors:
Tomaž Erjavec, Jakub Piskorski, Lidia Pivovarova, Jan Šnajder, Josef Steinberger, Roman Yangarber
Venue:
BSNLP
SIG:
SIGSLAV
Publisher:
Association for Computational Linguistics
Pages:
54–59
URL:
https://aclanthology.org/W17-1409
DOI:
10.18653/v1/W17-1409
Cite (ACL):
Paula Gombar, Zoran Medić, Domagoj Alagić, and Jan Šnajder. 2017. Debunking Sentiment Lexicons: A Case of Domain-Specific Sentiment Classification for Croatian. In Proceedings of the 6th Workshop on Balto-Slavic Natural Language Processing, pages 54–59, Valencia, Spain. Association for Computational Linguistics.
Cite (Informal):
Debunking Sentiment Lexicons: A Case of Domain-Specific Sentiment Classification for Croatian (Gombar et al., BSNLP 2017)
PDF:
https://aclanthology.org/W17-1409.pdf