RBG-AI: Benefits of Multilingual Language Models for Low-Resource Languages

Barathi Ganesh Hb, Michal Ptaszynski


Abstract
This paper investigates how multilingual language models benefit low-resource languages through our submission to the WMT 2025 Low-Resource Indic Language Translation shared task. We explore whether languages from related families can effectively support translation for low-resource languages that were absent or underrepresented during model training. Using a quantized multilingual pretrained foundation model, we examine zero-shot translation capabilities and cross-lingual transfer effects across three language families: Tibeto-Burman, Indo-Aryan, and Austroasiatic. Our findings demonstrate that multilingual models failed to leverage linguistic similarities, as particularly evidenced within the Tibeto-Burman family. The study provides insights into the practical feasibility of zero-shot translation in low-resource settings and into the role of language-family relationships in multilingual model performance.
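As a minimal sketch of the setup the abstract describes (a quantized multilingual pretrained foundation model used for zero-shot translation), the Python below loads an assumed stand-in model. The paper does not name its model or toolkit here, so facebook/nllb-200-distilled-600M, 4-bit bitsandbytes quantization, and the Mizo-to-English direction are illustrative assumptions only:

    # Hedged sketch, not the authors' system: model choice, quantization level,
    # and language pair are assumptions for illustration.
    # Requires: transformers, accelerate, bitsandbytes, torch.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, BitsAndBytesConfig

    MODEL = "facebook/nllb-200-distilled-600M"  # assumed multilingual foundation model

    # 4-bit quantization approximates the "quantized ... foundation model" setting.
    quant_config = BitsAndBytesConfig(load_in_4bit=True)

    tokenizer = AutoTokenizer.from_pretrained(MODEL, src_lang="lus_Latn")  # Mizo (Tibeto-Burman)
    model = AutoModelForSeq2SeqLM.from_pretrained(
        MODEL, quantization_config=quant_config, device_map="auto"
    )

    def translate(text: str, tgt_lang: str = "eng_Latn") -> str:
        """Zero-shot translation: no fine-tuning on the low-resource pair."""
        inputs = tokenizer(text, return_tensors="pt").to(model.device)
        output_ids = model.generate(
            **inputs,
            # Force the decoder to start generating in the target language.
            forced_bos_token_id=tokenizer.convert_tokens_to_ids(tgt_lang),
            max_new_tokens=128,
        )
        return tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0]

    print(translate("Khawvel hi a mawi."))  # illustrative Mizo input

Swapping src_lang across related language codes would probe the within-family cross-lingual transfer effects the abstract refers to.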
Anthology ID: 2025.wmt-1.100
Volume: Proceedings of the Tenth Conference on Machine Translation
Month: November
Year: 2025
Address: Suzhou, China
Editors: Barry Haddow, Tom Kocmi, Philipp Koehn, Christof Monz
Venue: WMT
Publisher: Association for Computational Linguistics
Pages: 1233–1239
URL: https://aclanthology.org/2025.wmt-1.100/
Cite (ACL): Barathi Ganesh Hb and Michal Ptaszynski. 2025. RBG-AI: Benefits of Multilingual Language Models for Low-Resource Languages. In Proceedings of the Tenth Conference on Machine Translation, pages 1233–1239, Suzhou, China. Association for Computational Linguistics.
Cite (Informal): RBG-AI: Benefits of Multilingual Language Models for Low-Resource Languages (Hb & Ptaszynski, WMT 2025)
PDF: https://aclanthology.org/2025.wmt-1.100.pdf