EmoHopeSpeech: An Annotated Dataset of Emotions and Hope Speech in English and Arabic

Wajdi Zaghouani, Md. Rafiul Biswas


Abstract
This research introduces a bilingual dataset comprising 27,456 entries for Arabic and 10,036 entries for English, annotated for emotions and hope speech, addressing the scarcity of multi-emotion (emotion and hope) datasets. The dataset provides comprehensive annotations capturing emotion intensity, complexity, and causes, alongside detailed classifications and subcategories for hope speech. To assess annotation reliability, Fleiss’ Kappa was computed, yielding agreement of 0.75–0.85 among annotators for both Arabic and English. The micro-F1 score of 0.67 obtained from the baseline model (a transformer-based AraBERT model) supports the quality of the data annotations.
Anthology ID:
2025.ranlp-1.162
Volume:
Proceedings of the 15th International Conference on Recent Advances in Natural Language Processing - Natural Language Processing in the Generative AI Era
Month:
September
Year:
2025
Address:
Varna, Bulgaria
Editors:
Galia Angelova, Maria Kunilovskaya, Marie Escribe, Ruslan Mitkov
Venue:
RANLP
Publisher:
INCOMA Ltd., Shoumen, Bulgaria
Pages:
1406–1412
URL:
https://aclanthology.org/2025.ranlp-1.162/
Cite (ACL):
Wajdi Zaghouani and Md. Rafiul Biswas. 2025. EmoHopeSpeech: An Annotated Dataset of Emotions and Hope Speech in English and Arabic. In Proceedings of the 15th International Conference on Recent Advances in Natural Language Processing - Natural Language Processing in the Generative AI Era, pages 1406–1412, Varna, Bulgaria. INCOMA Ltd., Shoumen, Bulgaria.
Cite (Informal):
EmoHopeSpeech: An Annotated Dataset of Emotions and Hope Speech in English and Arabic (Zaghouani & Biswas, RANLP 2025)
PDF:
https://aclanthology.org/2025.ranlp-1.162.pdf