Arabic Dialect Translation with Small LLMs: Enhancing through Reasoning-Oriented Reinforcement Learning

Sohaila Abdulsattar, Keith Ross


Abstract
Arabic dialect↔English machine translation remains difficult due to extreme dialect variation, inconsistent orthography, and limited parallel data. Moreover, dialect translation is often needed in remote regions or by economically disadvantaged communities, where compute-constrained or offline settings are the norm. Motivated by these concerns, in this paper we explore optimizing Arabic dialect↔English translators built on small LLMs, which could be deployed on small offline devices. We show that reasoning-oriented reinforcement learning can substantially improve small multilingual LLMs for Arabic dialect translation. Using the MADAR corpus, small Qwen-2.5 models trained with a think-then-translate template and optimized with Group Relative Policy Optimization (GRPO) using a SacreBLEU reward outperform a much larger 7B baseline trained with supervised fine-tuning. The dialect-to-English BLEU score more than doubles, from 17.4 to 34.9, while the English-to-dialect COMET score improves from 0.57 to 0.73.
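The think-then-translate template described in the abstract presumably has the model emit its reasoning before a final translation, from which the answer must be extracted before scoring. A minimal sketch of such extraction is below; the tag names and fallback behavior are illustrative assumptions, not taken from the paper:

```python
import re

def extract_translation(output: str) -> str:
    """Return the final translation from a think-then-translate response.

    Assumes the model wraps its reasoning in <think>...</think> and the
    answer in <translate>...</translate>; both tag names are hypothetical.
    """
    m = re.search(r"<translate>(.*?)</translate>", output, re.DOTALL)
    if m:
        return m.group(1).strip()
    # Fallback: take everything after the closing think tag, if present.
    return output.split("</think>")[-1].strip()

sample = (
    "<think>The word 'شلونك' is an Iraqi greeting meaning "
    "'how are you'.</think><translate>How are you?</translate>"
)
print(extract_translation(sample))  # → How are you?
```

In an RL setup like the one described, the extracted string would then be compared against the reference with a SacreBLEU-style metric to produce the scalar reward.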
Anthology ID:
2026.abjadnlp-1.11
Volume:
Proceedings of the 2nd Workshop on NLP for Languages Using Arabic Script
Month:
March
Year:
2026
Address:
Rabat, Morocco
Venues:
AbjadNLP | WS
Publisher:
Association for Computational Linguistics
Pages:
84–99
URL:
https://aclanthology.org/2026.abjadnlp-1.11/
Cite (ACL):
Sohaila Abdulsattar and Keith Ross. 2026. Arabic Dialect Translation with Small LLMs: Enhancing through Reasoning-Oriented Reinforcement Learning. In Proceedings of the 2nd Workshop on NLP for Languages Using Arabic Script, pages 84–99, Rabat, Morocco. Association for Computational Linguistics.
Cite (Informal):
Arabic Dialect Translation with Small LLMs: Enhancing through Reasoning-Oriented Reinforcement Learning (Abdulsattar & Ross, AbjadNLP 2026)
PDF:
https://aclanthology.org/2026.abjadnlp-1.11.pdf