Lost in Translation: Benchmarking Commercial Machine Translation Models for Dyslexic-Style Text

Gregory Price; Shaomei Wu

doi:10.18653/v1/2025.findings-acl.708

Abstract

Dyslexia can affect writing, leading to unique patterns such as letter and homophone swapping. As a result, text produced by people with dyslexia often differs from the text typically used to train natural language processing (NLP) models, raising concerns about their effectiveness for dyslexic users. This paper examines the fairness of four commercial machine translation (MT) systems towards dyslexic text through a systematic audit using both synthetically generated dyslexic text and real writing from individuals with dyslexia. By programmatically introducing various dyslexic-style errors into the WMT dataset, we present insights on how dyslexic biases manifest in MT systems as the text becomes more dyslexic, especially with real-word errors. Our results shed light on the NLP biases affecting people with dyslexia – a population that often relies on NLP tools as assistive technologies, highlighting the need for more diverse data and user representation in the development of foundational NLP models.
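To make "programmatically introducing dyslexic-style errors" concrete, here is a minimal Python sketch of the general idea. It is not the authors' pipeline. The `HOMOPHONES` table, the function names `swap_adjacent_letters` and `inject_errors`, and the `rate` and `seed` parameters are all hypothetical and chosen only for illustration. It covers the two error types the abstract names: homophone swaps, which produce real-word errors, and letter swaps, which usually produce non-words.

```python
import random

# Hypothetical homophone table for illustration only; not taken from the paper.
HOMOPHONES = {
    "their": "there",
    "there": "their",
    "to": "too",
    "too": "to",
    "hear": "here",
    "here": "hear",
}


def swap_adjacent_letters(word, rng):
    """Swap one pair of adjacent interior letters, keeping first and last fixed."""
    # Short or non-alphabetic tokens (e.g. with punctuation) are left unchanged.
    if len(word) < 4 or not word.isalpha():
        return word
    i = rng.randrange(1, len(word) - 2)  # i and i + 1 are both interior positions
    return word[:i] + word[i + 1] + word[i] + word[i + 2:]


def inject_errors(sentence, rate, seed=0):
    """Corrupt each whitespace-separated token with probability `rate`."""
    rng = random.Random(seed)  # seeded so the corruption is reproducible
    out = []
    for token in sentence.split():
        if rng.random() >= rate:
            out.append(token)  # token left untouched
            continue
        lower = token.lower()
        if lower in HOMOPHONES:
            # Real-word error; note that original capitalization is not preserved.
            out.append(HOMOPHONES[lower])
        else:
            # Letter swap, which usually yields a non-word.
            out.append(swap_adjacent_letters(token, rng))
    return " ".join(out)
```

Raising `rate` makes the text "more dyslexic" in the sense used above, so translating the same source sentence at several rates and comparing each output against the clean-input translation would give one rough view of how quality degrades.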

Anthology ID: 2025.findings-acl.708
Volume: Findings of the Association for Computational Linguistics: ACL 2025
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 13771–13782
URL: https://aclanthology.org/2025.findings-acl.708/
DOI: 10.18653/v1/2025.findings-acl.708
Cite (ACL): Gregory Price and Shaomei Wu. 2025. Lost in Translation: Benchmarking Commercial Machine Translation Models for Dyslexic-Style Text. In Findings of the Association for Computational Linguistics: ACL 2025, pages 13771–13782, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): Lost in Translation: Benchmarking Commercial Machine Translation Models for Dyslexic-Style Text (Price & Wu, Findings 2025)
PDF: https://aclanthology.org/2025.findings-acl.708.pdf
