Crowdsourcing-based Annotation of Emotions in Filipino and English Tweets

Fermin Roberto Lapitan, Riza Theresa Batista-Navarro, Eliezer Albacea


Abstract
The automatic analysis of emotions conveyed in social media content, e.g., tweets, has many beneficial applications. In the Philippines, one of the most disaster-prone countries in the world, such methods could enable first responders to make timely decisions despite the risk of data deluge. However, recognising emotions expressed in Philippine-generated tweets, which are mostly written in Filipino, English or a mix of both, is a non-trivial task. To facilitate the development of natural language processing (NLP) methods that automate this type of analysis, we have built a corpus of tweets whose predominant emotions have been manually annotated by means of crowdsourcing. By defining measures to ensure that only high-quality annotations were retained, we have produced a gold standard corpus of 1,146 emotion-labelled Filipino and English tweets. We validate the value of this manually produced resource by demonstrating that an automatic emotion-prediction method based on a publicly available word-emotion association lexicon was unable to reproduce the labels assigned via crowdsourcing. While we plan to extend the corpus in the near future, its current version has been made publicly available in order to foster the development of emotion analysis methods based on advanced Filipino and English NLP.
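As a rough illustration of what a lexicon-based emotion predictor of the kind mentioned in the abstract might look like, the following is a minimal sketch and not the authors' implementation. The toy lexicon entries, the emotion labels, and the names `TOY_LEXICON` and `predict_emotion` are all assumptions made for illustration; the abstract does not specify the actual lexicon, label set, or scoring rule.

```python
import re
from collections import Counter

# Hypothetical toy word-emotion lexicon for illustration only.
# These entries are NOT taken from the lexicon used in the paper.
TOY_LEXICON = {
    "happy": {"joy"},
    "masaya": {"joy"},
    "afraid": {"fear"},
    "takot": {"fear"},
    "flood": {"fear", "sadness"},
    "galit": {"anger"},
}


def predict_emotion(tweet, lexicon=TOY_LEXICON):
    """Return the emotion with the most lexicon hits, or None if no word matches."""
    # Lowercase and keep alphabetic tokens; a real system would need
    # tokenisation suited to tweets and to Filipino-English code-switching.
    tokens = re.findall(r"[a-z']+", tweet.lower())
    counts = Counter(
        emotion for token in tokens for emotion in lexicon.get(token, ())
    )
    if not counts:
        return None
    # Highest count wins; ties are broken alphabetically for determinism.
    return min(counts, key=lambda emotion: (-counts[emotion], emotion))


if __name__ == "__main__":
    for text in ["Masaya ako today, so happy!", "Takot ako sa flood", "hello world"]:
        print(text, "->", predict_emotion(text))
```

In this sketch, a tweet containing no lexicon word yields `None`, so the prediction depends entirely on how well the lexicon covers the vocabulary of the tweets.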
Anthology ID:
W16-3708
Volume:
Proceedings of the 6th Workshop on South and Southeast Asian Natural Language Processing (WSSANLP2016)
Month:
December
Year:
2016
Address:
Osaka, Japan
Venues:
WS | WSSANLP
Publisher:
The COLING 2016 Organizing Committee
Pages:
74–82
URL:
https://aclanthology.org/W16-3708
Cite (ACL):
Fermin Roberto Lapitan, Riza Theresa Batista-Navarro, and Eliezer Albacea. 2016. Crowdsourcing-based Annotation of Emotions in Filipino and English Tweets. In Proceedings of the 6th Workshop on South and Southeast Asian Natural Language Processing (WSSANLP2016), pages 74–82, Osaka, Japan. The COLING 2016 Organizing Committee.
Cite (Informal):
Crowdsourcing-based Annotation of Emotions in Filipino and English Tweets (Lapitan et al., 2016)
PDF:
https://aclanthology.org/W16-3708.pdf