EmoEvent: A Multilingual Emotion Corpus based on different Events

Flor Miriam Plaza del Arco, Carlo Strapparava, L. Alfonso Urena Lopez, Maite Martin


Abstract
In recent years, emotion detection in text has become more popular due to its potential applications in fields such as psychology, marketing, political science, and artificial intelligence, among others. While opinion mining is a well-established task with many standard data sets and well-defined methodologies, emotion mining has received less attention due to its complexity. In particular, the available annotated gold-standard resources are insufficient. To address this shortage, we present a multilingual emotion data set based on different events that took place in April 2019. We collected tweets from the Twitter platform. Each tweet was then labeled by three Amazon MTurkers with one of seven emotions: Ekman’s six basic emotions plus “neutral or other emotions”. A total of 8,409 tweets in Spanish and 7,303 in English were labeled. In addition, each tweet was labeled as offensive or not offensive. We report some linguistic statistics about the data set in order to observe the differences between English and Spanish speakers when they express emotions related to the same events. Moreover, to validate the effectiveness of the data set, we also propose a machine learning approach for automatically detecting emotions in tweets in both languages, English and Spanish.
Anthology ID:
2020.lrec-1.186
Volume:
Proceedings of the Twelfth Language Resources and Evaluation Conference
Month:
May
Year:
2020
Address:
Marseille, France
Editors:
Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
1492–1498
Language:
English
URL:
https://aclanthology.org/2020.lrec-1.186
Cite (ACL):
Flor Miriam Plaza del Arco, Carlo Strapparava, L. Alfonso Urena Lopez, and Maite Martin. 2020. EmoEvent: A Multilingual Emotion Corpus based on different Events. In Proceedings of the Twelfth Language Resources and Evaluation Conference, pages 1492–1498, Marseille, France. European Language Resources Association.
Cite (Informal):
EmoEvent: A Multilingual Emotion Corpus based on different Events (Plaza del Arco et al., LREC 2020)
PDF:
https://aclanthology.org/2020.lrec-1.186.pdf