Are Small Language Models the Silver Bullet to Low-Resource Languages Machine Translation?

Yewei Song; Lujun Li; Cedric Lothritz; Saad Ezzini; Lama Sleem; Niccolo’ Gentile; Radu State; Tegawendé F. Bissyandé; Jacques Klein

Abstract

Small language models (SLMs) offer computationally efficient alternatives to large language models, yet their translation quality for low-resource languages (LRLs) remains severely limited. This work presents the first large-scale evaluation of SLMs across 200 languages, revealing systematic underperformance in LRLs and identifying key sources of linguistic disparity. We show that knowledge distillation from strong teacher models using predominantly monolingual LRL data substantially boosts SLM translation quality—often enabling 2B–3B models to match or surpass systems up to 70B parameters. Our study highlights three core findings: (1) a comprehensive benchmark exposing the limitations of SLMs on 200 languages; (2) evidence that LRL-focused distillation improves translation without inducing catastrophic forgetting, with full-parameter fine-tuning and decoder-only teachers outperforming LoRA and encoder–decoder approaches; and (3) consistent cross-lingual gains demonstrating the scalability and robustness of the method. These results establish an effective, low-cost pathway for improving LRL translation and provide practical guidance for deploying SLMs in truly low-resource settings.

Anthology ID: 2026.loresmt-1.1
Volume: Proceedings for the Ninth Workshop on Technologies for Machine Translation of Low Resource Languages (LoResMT 2026)
Month: March
Year: 2026
Address: Rabat, Morocco
Editors: Atul Kr. Ojha, Chao-hong Liu, Ekaterina Vylomova, Flammie Pirinen, Jonathan Washington, Nathaniel Oco, Xiaobing Zhao
Venues: LoResMT | WS
Publisher: Association for Computational Linguistics
Pages: 1–26
URL: https://aclanthology.org/2026.loresmt-1.1/
Cite (ACL): Yewei Song, Lujun Li, Cedric Lothritz, Saad Ezzini, Lama Sleem, Niccolo' Gentile, Radu State, Tegawendé F. Bissyandé, and Jacques Klein. 2026. Are Small Language Models the Silver Bullet to Low-Resource Languages Machine Translation?. In Proceedings for the Ninth Workshop on Technologies for Machine Translation of Low Resource Languages (LoResMT 2026), pages 1–26, Rabat, Morocco. Association for Computational Linguistics.
Cite (Informal): Are Small Language Models the Silver Bullet to Low-Resource Languages Machine Translation? (Song et al., LoResMT 2026)
PDF: https://aclanthology.org/2026.loresmt-1.1.pdf