CUFE@NLU of Devanagari Script Languages 2025: Language Identification using fastText

Michael Ibrahim


Abstract
Language identification is a critical area of research within natural language processing (NLP), particularly in multilingual contexts where accurate language detection can enhance the performance of applications such as machine translation, content moderation, and user interaction systems. This paper presents a language identification system developed using fastText. In the CHiPSAL@COLING 2025 Task on Devanagari Script Language Identification, the proposed method achieved first place, with an F1 score of 0.9997.
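
As a rough illustration of the kind of pipeline the abstract describes, the sketch below formats examples in fastText's supervised-training layout and scores predictions with F1. It is not the author's implementation: the abstract gives no training details, so the hyperparameters (minn, maxn, epoch), the macro averaging of F1, the language codes, and the helper names to_fasttext_line, macro_f1, and train_and_predict are all assumptions made for illustration. Only the __label__ prefix and the fasttext.train_supervised and predict calls come from the fastText library itself.

```python
LABEL_PREFIX = "__label__"  # fastText's default label prefix


def to_fasttext_line(text, label):
    """Format one example as a line of a fastText supervised-training file."""
    # fastText reads one example per line, so collapse newlines and extra spaces.
    cleaned = " ".join(text.split())
    return f"{LABEL_PREFIX}{label} {cleaned}"


def macro_f1(gold, predicted):
    """Unweighted mean of per-label F1 (the averaging scheme is an assumption)."""
    labels = sorted(set(gold) | set(predicted))
    scores = []
    for label in labels:
        pairs = list(zip(gold, predicted))
        tp = sum(1 for g, p in pairs if g == label and p == label)
        fp = sum(1 for g, p in pairs if g != label and p == label)
        fn = sum(1 for g, p in pairs if g == label and p != label)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores) if scores else 0.0


def train_and_predict(train_path, texts):
    """Train a fastText classifier on train_path and label each text.

    Requires the third-party fasttext package, imported lazily so the
    helpers above work without it. Hyperparameters are illustrative
    guesses, not values reported by the paper.
    """
    import fasttext

    model = fasttext.train_supervised(input=train_path, minn=1, maxn=4, epoch=25)
    predictions = []
    for text in texts:
        labels, _probs = model.predict(" ".join(text.split()))
        predictions.append(labels[0][len(LABEL_PREFIX):])
    return predictions


if __name__ == "__main__":
    # Hypothetical example; "hin" is an illustrative language code.
    print(to_fasttext_line("नमस्ते  दुनिया\n", "hin"))
    print(macro_f1(["hin", "nep"], ["hin", "nep"]))
```

Character n-grams (minn, maxn) are used here because languages that share a script often differ mainly in subword patterns; whether the paper relied on them is not stated in the abstract.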
Anthology ID:
2025.chipsal-1.30
Volume:
Proceedings of the First Workshop on Challenges in Processing South Asian Languages (CHiPSAL 2025)
Month:
January
Year:
2025
Address:
Abu Dhabi, UAE
Editors:
Kengatharaiyer Sarveswaran, Ashwini Vaidya, Bal Krishna Bal, Sana Shams, Surendrabikram Thapa
Venues:
CHiPSAL | WS
Publisher:
International Committee on Computational Linguistics
Pages:
273–277
URL:
https://aclanthology.org/2025.chipsal-1.30/
Cite (ACL):
Michael Ibrahim. 2025. CUFE@NLU of Devanagari Script Languages 2025: Language Identification using fastText. In Proceedings of the First Workshop on Challenges in Processing South Asian Languages (CHiPSAL 2025), pages 273–277, Abu Dhabi, UAE. International Committee on Computational Linguistics.
Cite (Informal):
CUFE@NLU of Devanagari Script Languages 2025: Language Identification using fastText (Ibrahim, CHiPSAL 2025)
PDF:
https://aclanthology.org/2025.chipsal-1.30.pdf