CUET_Big_O@NLU of Devanagari Script Languages 2025: Identifying Script Language and Detecting Hate Speech Using Deep Learning and Transformer Model

Md. Refaj Hossan, Nazmus Sakib, Md. Alam Miah, Jawad Hossain, Mohammed Moshiul Hoque


Abstract
Text-based hate speech has been prevalent and is usually used to incite hostility and violence. Detecting this content becomes imperative, yet the task is challenging, particularly for low-resource languages in the Devanagari script, which must have the extensive labeled datasets required for effective machine learning. To address this, a shared task has been organized for identifying hate speech targets in Devanagari-script text. The task involves classifying targets such as individuals, organizations, and communities and identifying different languages within the script. We have explored several machine learning methods such as LR, SVM, MNB, and Random Forest, deep learning models using CNN, BiLSTM, GRU, CNN+BiLSTM, and transformer-based models like Indic-BERT, m-BERT, Verta-BERT, XLM-R, and MuRIL. The CNN with BiLSTM yielded the best performance (F1-score of 0.9941), placing the team 13th in the competition for script identification. Furthermore, the fine-tuned MuRIL-BERT model resulted in an F1 score of 0.6832, ranking us 4th for detecting hate speech targets.
Anthology ID:
2025.chipsal-1.27
Volume:
Proceedings of the First Workshop on Challenges in Processing South Asian Languages (CHiPSAL 2025)
Month:
January
Year:
2025
Address:
Abu Dhabi, UAE
Editors:
Kengatharaiyer Sarveswaran, Ashwini Vaidya, Bal Krishna Bal, Sana Shams, Surendrabikram Thapa
Venues:
CHiPSAL | WS
SIG:
Publisher:
International Committee on Computational Linguistics
Note:
Pages:
253–259
Language:
URL:
https://aclanthology.org/2025.chipsal-1.27/
DOI:
Bibkey:
Cite (ACL):
Md. Refaj Hossan, Nazmus Sakib, Md. Alam Miah, Jawad Hossain, and Mohammed Moshiul Hoque. 2025. CUET_Big_O@NLU of Devanagari Script Languages 2025: Identifying Script Language and Detecting Hate Speech Using Deep Learning and Transformer Model. In Proceedings of the First Workshop on Challenges in Processing South Asian Languages (CHiPSAL 2025), pages 253–259, Abu Dhabi, UAE. International Committee on Computational Linguistics.
Cite (Informal):
CUET_Big_O@NLU of Devanagari Script Languages 2025: Identifying Script Language and Detecting Hate Speech Using Deep Learning and Transformer Model (Hossan et al., CHiPSAL 2025)
Copy Citation:
PDF:
https://aclanthology.org/2025.chipsal-1.27.pdf