Does Machine Translation Impact Offensive Language Identification? The Case of Indo-Aryan Languages

Alphaeus Dmonte; Shrey Satapara; Rehab Alsudais; Tharindu Ranasinghe; Marcos Zampieri

Abstract

Machine translation (MT) can make social media platforms more accessible. However, non-standard features of user-generated social media content, such as hashtags, emojis, and alternative spellings, can cause MT systems to mistranslate. In this paper, we investigate the impact of MT on offensive language identification in Indo-Aryan languages. We use both original and machine-translated datasets to evaluate the performance of various offensive language identification models. Our evaluation indicates that offensive language identification models perform better on original data than on MT data, and that models trained on MT data identify offensive language more precisely on MT data than models trained on original data.

Anthology ID: 2025.loreslm-1.34
Volume: Proceedings of the First Workshop on Language Models for Low-Resource Languages
Month: January
Year: 2025
Address: Abu Dhabi, United Arab Emirates
Editors: Hansi Hettiarachchi, Tharindu Ranasinghe, Paul Rayson, Ruslan Mitkov, Mohamed Gaber, Damith Premasiri, Fiona Anting Tan, Lasitha Uyangodage
Venues: LoResLM | WS
Publisher: Association for Computational Linguistics
Pages: 460–468
URL: https://aclanthology.org/2025.loreslm-1.34/
Cite (ACL): Alphaeus Dmonte, Shrey Satapara, Rehab Alsudais, Tharindu Ranasinghe, and Marcos Zampieri. 2025. Does Machine Translation Impact Offensive Language Identification? The Case of Indo-Aryan Languages. In Proceedings of the First Workshop on Language Models for Low-Resource Languages, pages 460–468, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
Cite (Informal): Does Machine Translation Impact Offensive Language Identification? The Case of Indo-Aryan Languages (Dmonte et al., LoResLM 2025)
PDF: https://aclanthology.org/2025.loreslm-1.34.pdf
