Arabic Dialect Translation with Small LLMs: Enhancing through Reasoning-Oriented Reinforcement Learning

Sohaila Abdulsattar, Keith Ross


Abstract
Arabic dialect↔English machine translation remains difficult due to extreme dialect variation, inconsistent orthography, and limited parallel data. Moreover, dialect translation is often needed in remote regions or by economically disadvantaged communities, where compute-constrained or offline settings are the norm. Motivated by these concerns, in this paper we explore optimizing Arabic dialect↔English translators built on small LLMs, which could be deployed on small offline devices. We show that reasoning-oriented reinforcement learning can substantially improve small multilingual LLMs for Arabic dialect translation. Using the MADAR corpus, small Qwen-2.5 models trained with a think-then-translate template and optimized with Group Relative Policy Optimization (GRPO) using a SacreBLEU reward outperform a much larger 7B baseline trained with supervised fine-tuning. The dialect-to-English BLEU score more than doubles, from 17.4 to 34.9, while the English-to-dialect COMET score improves from 0.57 to 0.73.
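The think-then-translate template described in the abstract presumably has the model emit its reasoning before a final translation, from which the answer must be extracted before scoring. A minimal sketch of such extraction is below; the tag names and fallback behavior are illustrative assumptions, not taken from the paper:

```python
import re

def extract_translation(output: str) -> str:
    """Return the final translation from a think-then-translate response.

    Assumes the model wraps its reasoning in <think>...</think> and the
    answer in <translate>...</translate>; both tag names are hypothetical.
    """
    m = re.search(r"<translate>(.*?)</translate>", output, re.DOTALL)
    if m:
        return m.group(1).strip()
    # Fallback: take everything after the closing think tag, if present.
    return output.split("</think>")[-1].strip()

sample = (
    "<think>The word 'شلونك' is an Iraqi greeting meaning "
    "'how are you'.</think><translate>How are you?</translate>"
)
print(extract_translation(sample))  # → How are you?
```

In an RL setup like the one described, the extracted string would then be compared against the reference with a SacreBLEU-style metric to produce the scalar reward.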
Anthology ID:
2026.abjadnlp-1.11
Volume:
Proceedings of the 2nd Workshop on NLP for Languages Using Arabic Script
Month:
March
Year:
2026
Address:
Rabat, Morocco
Venues:
AbjadNLP | WS
Publisher:
Association for Computational Linguistics
Pages:
84–99
URL:
https://aclanthology.org/2026.abjadnlp-1.11/
Cite (ACL):
Sohaila Abdulsattar and Keith Ross. 2026. Arabic Dialect Translation with Small LLMs: Enhancing through Reasoning-Oriented Reinforcement Learning. In Proceedings of the 2nd Workshop on NLP for Languages Using Arabic Script, pages 84–99, Rabat, Morocco. Association for Computational Linguistics.
Cite (Informal):
Arabic Dialect Translation with Small LLMs: Enhancing through Reasoning-Oriented Reinforcement Learning (Abdulsattar & Ross, AbjadNLP 2026)
PDF:
https://aclanthology.org/2026.abjadnlp-1.11.pdf