Towards a new Benchmark for Emotion Detection in NLP: A Unifying Framework of Recent Corpora

Anna Koufakou, Elijah Nieves, John Peller


Abstract
Emotion recognition in text is a complex and evolving field that has garnered considerable interest. This paper addresses the pressing need to explore and experiment with new corpora annotated with emotions. We identified several corpora presented since 2018. We restricted this study to English single-labeled data. Nevertheless, the datasets vary in source, domain, topic, emotion types, and distributions. As a basis for benchmarking, we conducted emotion detection experiments by fine-tuning a pretrained model and compared our outcomes with results from the original publications. More importantly, in our efforts to combine existing resources, we created a unified corpus from these diverse datasets and evaluated the impact of training on that corpus versus on the training set for each corpus. Our approach aims to streamline research by offering a unified platform for emotion detection to aid comparisons and benchmarking, addressing a significant gap in the current landscape. Additionally, we present a discussion of related practices and challenges. Our code and dataset information are available at https://github.com/a-koufakou/EmoDetect-Unify. We hope this will enable the NLP community to leverage this unified framework towards a new benchmark in emotion detection.
Anthology ID:
2024.genbench-1.13
Volume:
Proceedings of the 2nd GenBench Workshop on Generalisation (Benchmarking) in NLP
Month:
November
Year:
2024
Address:
Miami, Florida, USA
Editors:
Dieuwke Hupkes, Verna Dankers, Khuyagbaatar Batsuren, Amirhossein Kazemnejad, Christos Christodoulopoulos, Mario Giulianelli, Ryan Cotterell
Venue:
GenBench
Publisher:
Association for Computational Linguistics
Pages:
196–206
URL:
https://aclanthology.org/2024.genbench-1.13
Cite (ACL):
Anna Koufakou, Elijah Nieves, and John Peller. 2024. Towards a new Benchmark for Emotion Detection in NLP: A Unifying Framework of Recent Corpora. In Proceedings of the 2nd GenBench Workshop on Generalisation (Benchmarking) in NLP, pages 196–206, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal):
Towards a new Benchmark for Emotion Detection in NLP: A Unifying Framework of Recent Corpora (Koufakou et al., GenBench 2024)
PDF:
https://aclanthology.org/2024.genbench-1.13.pdf