Zaima Taheri


2023

BEmoLexBERT: A Hybrid Model for Multilabel Textual Emotion Classification in Bangla by Combining Transformers with Lexicon Features
Ahasan Kabir | Animesh Roy | Zaima Taheri
Proceedings of the First Workshop on Bangla Language Processing (BLP-2023)

Multilabel textual emotion classification involves the extraction of emotions from text data, a task that has seen significant progress in high-resource languages. However, resource-constrained languages like Bangla have received comparatively less attention in the field of emotion classification. Furthermore, the availability of a comprehensive and accurate emotion lexicon specifically designed for the Bangla language is limited. In this paper, we present a hybrid model that combines lexicon features with transformers for multilabel emotion classification in the Bangla language. We have developed a comprehensive Bangla emotion lexicon consisting of 5336 carefully curated entries across nine emotion categories. We experimented with pre-trained transformers including mBERT, XLM-R, BanglishBERT, and BanglaBERT on the EmoNaBa (Islam et al., 2022) dataset. By integrating lexicon features from our emotion lexicon, we evaluate the performance of these transformers in emotion detection tasks. The results demonstrate that incorporating lexicon features significantly improves the performance of transformers. Among the evaluated models, our hybrid approach achieves the highest performance using BanglaBERT (large) (Bhattacharjee et al., 2022) as the pre-trained transformer along with our emotion lexicon, achieving an impressive weighted F1 score of 82.73%. The emotion lexicon is publicly available at https://github.com/Ahasannn/BEmoLex-Bangla_Emotion_Lexicon
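
The abstract does not spell out how lexicon features are combined with the transformer, so the following is only a minimal sketch of one common way to do it: count lexicon hits per emotion category, normalise by text length, and append those values to a sentence embedding before classification. Everything here is an assumption for illustration: the toy English lexicon `TOY_LEXICON`, the functions `lexicon_features` and `hybrid_vector`, and the placeholder embedding are hypothetical and are not taken from the paper or its released lexicon.

```python
from typing import Dict, List, Set

# Hypothetical toy lexicon (English placeholders, NOT the released Bangla lexicon):
# emotion category -> set of words associated with that emotion.
TOY_LEXICON: Dict[str, Set[str]] = {
    "joy": {"happy", "delighted"},
    "anger": {"furious", "angry"},
    "sadness": {"sad", "tearful"},
}

# Fixed, sorted category order so every feature vector has the same layout.
CATEGORIES: List[str] = sorted(TOY_LEXICON)


def lexicon_features(text: str) -> List[float]:
    """Return, per category, the fraction of tokens found in that category's word set."""
    tokens = text.lower().split()  # naive whitespace tokenisation (assumption)
    total = max(len(tokens), 1)    # avoid division by zero on empty input
    return [
        sum(tok in TOY_LEXICON[cat] for tok in tokens) / total
        for cat in CATEGORIES
    ]


def hybrid_vector(sentence_embedding: List[float], text: str) -> List[float]:
    """Concatenate a sentence embedding with the lexicon features for the same text."""
    return list(sentence_embedding) + lexicon_features(text)
```

In a real pipeline the placeholder embedding would be replaced by the pooled output of a pre-trained transformer, and the concatenated vector would feed a multilabel classification head with one sigmoid output per emotion.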