Challenges of Evaluating Sentiment Analysis Tools on Social Media

Diana Maynard, Kalina Bontcheva
Abstract
This paper discusses the challenges in carrying out fair comparative evaluations of sentiment analysis systems. Firstly, these are due to differences in corpus annotation guidelines and sentiment class distribution. Secondly, different systems often make different assumptions about how to interpret certain statements, e.g. tweets with URLs. In order to study the impact of these on evaluation results, this paper focuses on tweet sentiment analysis in particular. One existing and two newly created corpora are used, and the performance of four different sentiment analysis systems is reported; we make our annotated datasets and sentiment analysis applications publicly available. We see considerable variations in results across the different corpora, which calls into question the validity of many existing annotated datasets and evaluations, and we make some observations about both the systems and the datasets as a result.
Anthology ID:
L16-1182
Volume:
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
Month:
May
Year:
2016
Address:
Portorož, Slovenia
Editors:
Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Sara Goggi, Marko Grobelnik, Bente Maegaard, Joseph Mariani, Helene Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
1142–1148
URL:
https://aclanthology.org/L16-1182
Cite (ACL):
Diana Maynard and Kalina Bontcheva. 2016. Challenges of Evaluating Sentiment Analysis Tools on Social Media. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16), pages 1142–1148, Portorož, Slovenia. European Language Resources Association (ELRA).
Cite (Informal):
Challenges of Evaluating Sentiment Analysis Tools on Social Media (Maynard & Bontcheva, LREC 2016)
PDF:
https://aclanthology.org/L16-1182.pdf