Data Set for Stance and Sentiment Analysis from User Comments on Croatian News

Mihaela Bošnjak, Mladen Karan


Abstract
Nowadays it is becoming more important than ever to find new ways of extracting useful information from the ever-growing amount of user-generated data available online. In this paper, we describe the creation of a data set that contains news articles and corresponding comments from the Croatian news outlet 24 sata. Our annotation scheme is specifically tailored for the task of detecting stance and sentiment in user comments, as well as assessing whether commentators' claims are verifiable. Through this data, we hope to get a better understanding of the public's viewpoint on various events. In addition, we explore the potential of applying supervised machine learning models to automate the annotation of more data.
Anthology ID:
W19-3707
Volume:
Proceedings of the 7th Workshop on Balto-Slavic Natural Language Processing
Month:
August
Year:
2019
Address:
Florence, Italy
Editors:
Tomaž Erjavec, Michał Marcińczuk, Preslav Nakov, Jakub Piskorski, Lidia Pivovarova, Jan Šnajder, Josef Steinberger, Roman Yangarber
Venue:
BSNLP
SIG:
SIGSLAV
Publisher:
Association for Computational Linguistics
Pages:
50–55
URL:
https://aclanthology.org/W19-3707
DOI:
10.18653/v1/W19-3707
Cite (ACL):
Mihaela Bošnjak and Mladen Karan. 2019. Data Set for Stance and Sentiment Analysis from User Comments on Croatian News. In Proceedings of the 7th Workshop on Balto-Slavic Natural Language Processing, pages 50–55, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
Data Set for Stance and Sentiment Analysis from User Comments on Croatian News (Bošnjak & Karan, BSNLP 2019)
PDF:
https://aclanthology.org/W19-3707.pdf