byteSizedLLM@NLU of Devanagari Script Languages 2025: Hate Speech Detection and Target Identification Using Customized Attention BiLSTM and XLM-RoBERTa Base Embeddings

Rohith Gowtham Kodali; Durga Prasad Manukonda; Daniel Iglesias

Abstract

This paper presents a novel approach to hate speech detection and target identification across Devanagari-script languages, with a focus on Hindi and Nepali. Leveraging an Attention BiLSTM-XLM-RoBERTa architecture, our model effectively captures language-specific features and sequential dependencies crucial for multilingual natural language understanding (NLU). In Task B (Hate Speech Detection), our model achieved a Macro F1 score of 0.7481, demonstrating its robustness in identifying hateful content across linguistic variations. For Task C (Target Identification), it reached a Macro F1 score of 0.6715, highlighting its ability to classify targets into “individual,” “organization,” and “community” with high accuracy. Our work addresses the gap in Devanagari-scripted multilingual hate speech analysis and sets a benchmark for future research in low-resource language contexts.
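Both reported scores are Macro F1 values, meaning the unweighted mean of the per-class F1 scores, so each class counts equally regardless of how many examples it has. As a minimal sketch of how such a score is computed, the following uses the three Task C target labels with a small made-up set of gold and predicted labels. The function name `macro_f1` and the toy data are illustrative assumptions, not the authors' evaluation code or the shared-task data.

```python
def macro_f1(gold, pred, labels):
    """Return the unweighted mean of per-class F1 scores."""
    f1_scores = []
    for label in labels:
        pairs = list(zip(gold, pred))
        # Count true positives, false positives, and false negatives for this class.
        tp = sum(1 for g, p in pairs if g == label and p == label)
        fp = sum(1 for g, p in pairs if g != label and p == label)
        fn = sum(1 for g, p in pairs if g == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        f1_scores.append(f1)
    return sum(f1_scores) / len(f1_scores)


# Hypothetical toy example (not taken from the paper).
labels = ["individual", "organization", "community"]
gold = ["individual", "individual", "organization",
        "community", "community", "community"]
pred = ["individual", "organization", "organization",
        "community", "community", "individual"]

score = macro_f1(gold, pred, labels)
print(f"Macro F1: {score:.4f}")
```

Because every class contributes equally to the average, a model that does poorly on a rare class is penalized as much as one that does poorly on a frequent class, which is why the metric suits imbalanced label sets.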

Anthology ID: 2025.chipsal-1.25
Volume: Proceedings of the First Workshop on Challenges in Processing South Asian Languages (CHiPSAL 2025)
Month: January
Year: 2025
Address: Abu Dhabi, UAE
Editors: Kengatharaiyer Sarveswaran, Ashwini Vaidya, Bal Krishna Bal, Sana Shams, Surendrabikram Thapa
Venues: CHiPSAL | WS
Publisher: International Committee on Computational Linguistics
Pages: 242–247
URL: https://aclanthology.org/2025.chipsal-1.25/
Cite (ACL): Rohith Gowtham Kodali, Durga Prasad Manukonda, and Daniel Iglesias. 2025. byteSizedLLM@NLU of Devanagari Script Languages 2025: Hate Speech Detection and Target Identification Using Customized Attention BiLSTM and XLM-RoBERTa Base Embeddings. In Proceedings of the First Workshop on Challenges in Processing South Asian Languages (CHiPSAL 2025), pages 242–247, Abu Dhabi, UAE. International Committee on Computational Linguistics.
Cite (Informal): byteSizedLLM@NLU of Devanagari Script Languages 2025: Hate Speech Detection and Target Identification Using Customized Attention BiLSTM and XLM-RoBERTa Base Embeddings (Kodali et al., CHiPSAL 2025)
PDF: https://aclanthology.org/2025.chipsal-1.25.pdf
