On the Impact of Seed Words on Sentiment Polarity Lexicon Induction

Dame Jovanoski, Veno Pachovski, Preslav Nakov


Abstract
Sentiment polarity lexicons are key resources for sentiment analysis, and researchers have invested considerable effort in their manual creation. However, there has been a recent shift towards automatically extracted lexicons, which are orders of magnitude larger and perform much better. These lexicons are typically mined using bootstrapping, starting from very few seed words whose polarity is given, e.g., 50-60 words, and sometimes even just 5-6. Here we demonstrate that much higher-quality lexicons can be built by starting with hundreds of words and phrases as seeds, especially when they are in-domain. Thus, we combine (i) mid-sized high-quality manually crafted lexicons as seeds and (ii) bootstrapping, in order to build large-scale lexicons.
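
To make the idea of seed-based lexicon induction concrete, the following is a minimal sketch of one common approach: scoring each word by the difference between its pointwise mutual information (PMI) with positive seeds and with negative seeds, computed from document-level co-occurrence. The abstract does not specify the scoring function, so the PMI formulation, the function name `induce_lexicon`, its parameters, and the add-one smoothing are all assumptions made for illustration, not the authors' implementation. The sketch performs a single scoring pass; a bootstrapping setup would typically repeat it, promoting high-confidence words to seeds.

```python
import math
from collections import Counter


def induce_lexicon(docs, pos_seeds, neg_seeds, min_count=1):
    """Hypothetical helper: score non-seed words by PMI(w, pos) - PMI(w, neg).

    docs      -- iterable of whitespace-tokenizable strings
    pos_seeds -- words assumed to be positive
    neg_seeds -- words assumed to be negative
    min_count -- ignore words seen in fewer documents than this
    """
    pos_seeds, neg_seeds = set(pos_seeds), set(neg_seeds)
    word_pos = Counter()   # documents containing the word and a positive seed
    word_neg = Counter()   # documents containing the word and a negative seed
    word_freq = Counter()  # documents containing the word at all
    n_pos = n_neg = 0      # documents containing any positive / negative seed

    for doc in docs:
        tokens = set(doc.lower().split())
        has_pos = bool(tokens & pos_seeds)
        has_neg = bool(tokens & neg_seeds)
        n_pos += has_pos
        n_neg += has_neg
        for w in tokens - pos_seeds - neg_seeds:
            word_freq[w] += 1
            if has_pos:
                word_pos[w] += 1
            if has_neg:
                word_neg[w] += 1

    lexicon = {}
    for w, freq in word_freq.items():
        if freq < min_count:
            continue
        # PMI(w, pos) - PMI(w, neg) simplifies to
        # log2( c(w, pos) * n_neg / (c(w, neg) * n_pos) );
        # add-one smoothing (an assumption) avoids division by zero.
        lexicon[w] = math.log2(
            ((word_pos[w] + 1) * (n_neg + 1))
            / ((word_neg[w] + 1) * (n_pos + 1))
        )
    return lexicon


# Toy usage with one seed per polarity (illustrative data only).
docs = [
    "good great movie",
    "good superb acting",
    "bad awful plot",
    "bad dreadful pacing",
]
lexicon = induce_lexicon(docs, pos_seeds={"good"}, neg_seeds={"bad"})
```

In this toy corpus, words that co-occur with the positive seed receive positive scores and words that co-occur with the negative seed receive negative ones. With a larger seed set, more documents contain at least one seed, so more words receive non-trivial co-occurrence counts, which is one intuition for why seed size could matter.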
Anthology ID:
C16-1147
Volume:
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers
Month:
December
Year:
2016
Address:
Osaka, Japan
Editors:
Yuji Matsumoto, Rashmi Prasad
Venue:
COLING
Publisher:
The COLING 2016 Organizing Committee
Pages:
1557–1567
URL:
https://aclanthology.org/C16-1147
Cite (ACL):
Dame Jovanoski, Veno Pachovski, and Preslav Nakov. 2016. On the Impact of Seed Words on Sentiment Polarity Lexicon Induction. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, pages 1557–1567, Osaka, Japan. The COLING 2016 Organizing Committee.
Cite (Informal):
On the Impact of Seed Words on Sentiment Polarity Lexicon Induction (Jovanoski et al., COLING 2016)
PDF:
https://aclanthology.org/C16-1147.pdf
Code:
badc0re/sent-lex