Robustness of LLMs to Transliteration Perturbations in Bangla

Fabiha Haider; Md Farhan Ishmam; Fariha Tanjim Shifat; Md Tasmim Rahman Adib; Md Fahim; Md Farhad Alam Bhuiyan

Abstract

Bangla text on the internet often appears in mixed scripts, combining native Bangla characters with Romanized transliterations. To be usable in practice, language models should be robust to this naturally occurring script mixing. We investigate the robustness of current LLMs and Bangla language models under several transliteration-based textual perturbations, which we create by transliterating portions of existing Bangla datasets. Specifically, we replace words and sentences with their transliterated text to emulate realistic script mixing, and we replace the top-k salient words to emulate adversarial script mixing. Our experiments reveal behavioral patterns and robustness vulnerabilities in language models for Bangla, findings that matter for deploying such models in real-world scenarios and for improving their robustness.
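As a rough illustration of the two word-level perturbation styles described in the abstract, the Python sketch below transliterates either a random fraction of words or the k words with the highest saliency scores. It is not the authors' implementation: the three-word lookup table, the function names, the `ratio` and `seed` parameters, and the saliency values are all hypothetical stand-ins, and a real setup would presumably use a proper transliteration tool and model-derived saliency scores.

```python
import random
from typing import Callable, List, Sequence

# Toy lookup table standing in for a real Bangla-to-Roman transliterator.
# Hypothetical: chosen only to keep the example self-contained.
TOY_TABLE = {"আমি": "ami", "ভাত": "bhat", "খাই": "khai"}


def toy_transliterate(word: str) -> str:
    """Return the Romanized form if known, else the word unchanged."""
    return TOY_TABLE.get(word, word)


def perturb_random_words(
    words: Sequence[str],
    transliterate: Callable[[str], str],
    ratio: float,
    seed: int = 0,
) -> List[str]:
    """Transliterate a random fraction `ratio` (0..1) of the words."""
    rng = random.Random(seed)
    n = max(0, min(len(words), round(len(words) * ratio)))
    chosen = set(rng.sample(range(len(words)), n))
    return [transliterate(w) if i in chosen else w for i, w in enumerate(words)]


def perturb_top_k(
    words: Sequence[str],
    saliency: Sequence[float],
    transliterate: Callable[[str], str],
    k: int,
) -> List[str]:
    """Transliterate the k words with the highest saliency scores."""
    ranked = sorted(range(len(words)), key=lambda i: saliency[i], reverse=True)
    chosen = set(ranked[:k])
    return [transliterate(w) if i in chosen else w for i, w in enumerate(words)]


sentence = ["আমি", "ভাত", "খাই"]   # "I eat rice"
scores = [0.1, 0.7, 0.2]           # assumed saliency scores, not real model output

print(perturb_random_words(sentence, toy_transliterate, ratio=1.0))
print(perturb_top_k(sentence, scores, toy_transliterate, k=1))
```

Sentence-level perturbation would follow the same pattern with a list of sentences in place of a list of words. The random variant approximates incidental script mixing, while the top-k variant targets the words a model is assumed to rely on most.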

Anthology ID: 2025.banglalp-1.27
Volume: Proceedings of the Second Workshop on Bangla Language Processing (BLP-2025)
Month: December
Year: 2025
Address: Mumbai, India
Editors: Firoj Alam, Sudipta Kar, Shammur Absar Chowdhury, Naeemul Hassan, Enamul Hoque Prince, Mohiuddin Tasnim, Md Rashad Al Hasan Rony, Md Tahmid Rahman Rahman
Venues: BanglaLP | WS
Publisher: Association for Computational Linguistics
Pages: 338–346
URL: https://aclanthology.org/2025.banglalp-1.27/
Cite (ACL): Fabiha Haider, Md Farhan Ishmam, Fariha Tanjim Shifat, Md Tasmim Rahman Adib, Md Fahim, and Md Farhad Alam Bhuiyan. 2025. Robustness of LLMs to Transliteration Perturbations in Bangla. In Proceedings of the Second Workshop on Bangla Language Processing (BLP-2025), pages 338–346, Mumbai, India. Association for Computational Linguistics.
Cite (Informal): Robustness of LLMs to Transliteration Perturbations in Bangla (Haider et al., BanglaLP 2025)
PDF: https://aclanthology.org/2025.banglalp-1.27.pdf