Crowdsourcing-based Annotation of Emotions in Filipino and English Tweets

Fermin Roberto Lapitan; Riza Theresa Batista-Navarro; Eliezer Albacea

Abstract

The automatic analysis of emotions conveyed in social media content, e.g., tweets, has many beneficial applications. In the Philippines, one of the most disaster-prone countries in the world, such methods could potentially enable first responders to make timely decisions despite the risk of data deluge. However, recognising emotions expressed in Philippine-generated tweets, which are mostly written in Filipino, English or a mix of both, is a non-trivial task. To facilitate the development of natural language processing (NLP) methods that will automate this type of analysis, we have built a corpus of tweets whose predominant emotions have been manually annotated by means of crowdsourcing. By defining measures to ensure that only high-quality annotations were retained, we have produced a gold standard corpus of 1,146 emotion-labelled Filipino and English tweets. We validate the value of this manually produced resource by demonstrating that an automatic emotion-prediction method based on a publicly available word-emotion association lexicon was unable to reproduce the labels assigned via crowdsourcing. While we plan to make a few extensions to the corpus in the near future, its current version has been made publicly available in order to foster the development of emotion analysis methods based on advanced Filipino and English NLP.
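
As a rough illustration of what a lexicon-based emotion predictor of the kind mentioned in the abstract might look like, the following minimal Python sketch counts lexicon hits per emotion and returns the most frequent one. It is not the authors' implementation: the `TOY_LEXICON` entries, the `predict_emotion` function, the tokenisation, and the tie-breaking rule are all assumptions made for illustration, and the lexicon used in the paper is not reproduced here.

```python
import re
from collections import Counter

# Hypothetical toy word-emotion lexicon, for illustration only.
# These entries are NOT taken from the lexicon used in the paper.
TOY_LEXICON = {
    "happy": {"joy"},
    "masaya": {"joy"},
    "afraid": {"fear"},
    "takot": {"fear"},
    "flood": {"fear", "sadness"},
    "angry": {"anger"},
}


def predict_emotion(tweet, lexicon=TOY_LEXICON):
    """Return the emotion with the most lexicon hits, or None if no token matches."""
    # Simplistic tokenisation (an assumption): lowercase alphabetic runs only.
    tokens = re.findall(r"[a-z]+", tweet.lower())
    counts = Counter()
    for token in tokens:
        for emotion in lexicon.get(token, ()):
            counts[emotion] += 1
    if not counts:
        return None
    # Ties are broken alphabetically so the result is deterministic.
    best = max(counts.values())
    return min(e for e, c in counts.items() if c == best)
```

A predictor like this returns no label for any tweet whose words fall outside the lexicon, which is one plausible reason such a method can struggle on tweets that mix Filipino and English.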

Anthology ID: W16-3708
Volume: Proceedings of the 6th Workshop on South and Southeast Asian Natural Language Processing (WSSANLP2016)
Month: December
Year: 2016
Address: Osaka, Japan
Editors: Dekai Wu, Pushpak Bhattacharyya
Venue: WSSANLP
Publisher: The COLING 2016 Organizing Committee
Pages: 74–82
URL: https://aclanthology.org/W16-3708/
Cite (ACL): Fermin Roberto Lapitan, Riza Theresa Batista-Navarro, and Eliezer Albacea. 2016. Crowdsourcing-based Annotation of Emotions in Filipino and English Tweets. In Proceedings of the 6th Workshop on South and Southeast Asian Natural Language Processing (WSSANLP2016), pages 74–82, Osaka, Japan. The COLING 2016 Organizing Committee.
Cite (Informal): Crowdsourcing-based Annotation of Emotions in Filipino and English Tweets (Lapitan et al., WSSANLP 2016)
PDF: https://aclanthology.org/W16-3708.pdf
