1-800-SHARED-TASKS@NLU of Devanagari Script Languages 2025: Detection of Language, Hate Speech, and Targets using LLMs

Jebish Purbey; Siddartha Pullakhandam; Kanwal Mehreen; Muhammad Arham; Drishti Sharma; Ashay Srivastava; Ram Mohan Rao Kadiyala

Abstract

This paper describes our system for the CHiPSAL 2025 challenge, which covers language detection, hate speech identification, and target detection in Devanagari script languages. We experimented with large language models and their ensembles, including MuRIL, IndicBERT, and Gemma-2, and used techniques such as focal loss to address challenges in natural language understanding of Devanagari script languages, such as multilingual processing and class imbalance. Our approach achieved competitive results across all tasks, with F1 scores of 0.9980, 0.7652, and 0.6804 for Sub-tasks A, B, and C respectively. This work provides insights into how effective transformer models are on tasks with domain-specific and linguistic challenges, and identifies areas for improvement in future iterations.
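To illustrate how focal loss addresses class imbalance, here is a minimal pure-Python sketch of the standard formulation, FL(p_t) = -alpha * (1 - p_t)^gamma * log(p_t). It is not the authors' implementation: the abstract does not state their hyperparameters or framework, so the function names and the defaults `gamma=2.0` and `alpha=1.0` are assumptions chosen for illustration.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(logits, target, gamma=2.0, alpha=1.0):
    """Focal loss for one example.

    gamma and alpha are illustrative defaults, not values from the paper.
    With gamma=0 and alpha=1 this reduces to ordinary cross-entropy.
    """
    p_t = softmax(logits)[target]  # probability assigned to the true class
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)


def cross_entropy(logits, target):
    """Plain cross-entropy, expressed as focal loss with gamma=0."""
    return focal_loss(logits, target, gamma=0.0, alpha=1.0)


if __name__ == "__main__":
    easy = [5.0, 0.0]   # confidently correct prediction for class 0
    hard = [-1.0, 1.0]  # wrong prediction for class 0
    for name, logits in (("easy", easy), ("hard", hard)):
        ce = cross_entropy(logits, 0)
        fl = focal_loss(logits, 0)
        print(f"{name}: cross-entropy={ce:.4f} focal={fl:.4f}")
```

The factor (1 - p_t)^gamma shrinks the loss on examples the model already classifies confidently, so training signal concentrates on hard examples, which under class imbalance often come from the minority classes.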

Anthology ID: 2025.chipsal-1.23
Volume: Proceedings of the First Workshop on Challenges in Processing South Asian Languages (CHiPSAL 2025)
Month: January
Year: 2025
Address: Abu Dhabi, UAE
Editors: Kengatharaiyer Sarveswaran, Ashwini Vaidya, Bal Krishna Bal, Sana Shams, Surendrabikram Thapa
Venues: CHiPSAL | WS
Publisher: International Committee on Computational Linguistics
Pages: 223–235
URL: https://aclanthology.org/2025.chipsal-1.23/
Cite (ACL): Jebish Purbey, Siddartha Pullakhandam, Kanwal Mehreen, Muhammad Arham, Drishti Sharma, Ashay Srivastava, and Ram Mohan Rao Kadiyala. 2025. 1-800-SHARED-TASKS@NLU of Devanagari Script Languages 2025: Detection of Language, Hate Speech, and Targets using LLMs. In Proceedings of the First Workshop on Challenges in Processing South Asian Languages (CHiPSAL 2025), pages 223–235, Abu Dhabi, UAE. International Committee on Computational Linguistics.
Cite (Informal): 1-800-SHARED-TASKS@NLU of Devanagari Script Languages 2025: Detection of Language, Hate Speech, and Targets using LLMs (Purbey et al., CHiPSAL 2025)
PDF: https://aclanthology.org/2025.chipsal-1.23.pdf
