Uncovering the Limits of Text-based Emotion Detection

Nurudin Alvarez-Gonzalez, Andreas Kaltenbrunner, Vicenç Gómez


Abstract
Identifying emotions from text is crucial for a variety of real-world tasks. We consider the two largest now-available corpora for emotion classification: GoEmotions, with 58k messages labelled by readers, and Vent, with 33M writer-labelled messages. We design a benchmark and evaluate several feature spaces and learning algorithms, including two simple yet novel models on top of BERT that outperform previous strong baselines on GoEmotions. Through an experiment with human participants, we also analyze the differences between how writers express emotions and how readers perceive them. Our results suggest that emotions expressed by writers are harder to identify than emotions that readers perceive. We share a public web interface for researchers to explore our models.
Anthology ID:
2021.findings-emnlp.219
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2021
Month:
November
Year:
2021
Address:
Punta Cana, Dominican Republic
Venue:
Findings
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Pages:
2560–2583
URL:
https://aclanthology.org/2021.findings-emnlp.219
DOI:
10.18653/v1/2021.findings-emnlp.219
Cite (ACL):
Nurudin Alvarez-Gonzalez, Andreas Kaltenbrunner, and Vicenç Gómez. 2021. Uncovering the Limits of Text-based Emotion Detection. In Findings of the Association for Computational Linguistics: EMNLP 2021, pages 2560–2583, Punta Cana, Dominican Republic. Association for Computational Linguistics.
Cite (Informal):
Uncovering the Limits of Text-based Emotion Detection (Alvarez-Gonzalez et al., Findings 2021)
PDF:
https://aclanthology.org/2021.findings-emnlp.219.pdf
Video:
https://aclanthology.org/2021.findings-emnlp.219.mp4
Code:
nur-ag/emotion-classification
Data:
GoEmotions, Vent