@inproceedings{mullov-waibel-2025-shot,
title = "Few-Shot Learning Translation from New Languages",
author = "Mullov, Carlos and
Waibel, Alexander",
editor = "Christodoulopoulos, Christos and
Chakraborty, Tanmoy and
Rose, Carolyn and
Peng, Violet",
booktitle = "Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing",
month = nov,
year = "2025",
address = "Suzhou, China",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2025.emnlp-main.163/",
pages = "3309--3330",
ISBN = "979-8-89176-332-6",
abstract = "Recent work shows strong transfer learning capability to unseen languages in sequence-to-sequence neural networks, under the assumption that we have high-quality word representations for the target language. We evaluate whether this direction is a viable path forward for translation from low-resource languages by investigating how much data is required to learn such high-quality word representations. We first show that learning word embeddings separately from a translation model can enable rapid adaptation to new languages with only a few hundred sentences of parallel data. To see whether the current bottleneck in transfer to low-resource languages lies mainly with learning the word representations, we then train word embeddings models on varying amounts of data, to then plug them into a machine translation model. We show that in this simulated low-resource setting with only 500 parallel sentences and 31,250 sentences of monolingual data we can exceed 15 BLEU on Flores on unseen languages. Finally, we investigate why on a real low-resource language the results are less favorable and find fault with the publicly available multilingual language modelling datasets."
}<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
  <mods ID="mullov-waibel-2025-shot">
    <titleInfo>
      <title>Few-Shot Learning Translation from New Languages</title>
    </titleInfo>
    <name type="personal">
      <namePart type="given">Carlos</namePart>
      <namePart type="family">Mullov</namePart>
      <role>
        <roleTerm authority="marcrelator" type="text">author</roleTerm>
      </role>
    </name>
    <name type="personal">
      <namePart type="given">Alexander</namePart>
      <namePart type="family">Waibel</namePart>
      <role>
        <roleTerm authority="marcrelator" type="text">author</roleTerm>
      </role>
    </name>
    <originInfo>
      <dateIssued>2025-11</dateIssued>
    </originInfo>
    <typeOfResource>text</typeOfResource>
    <relatedItem type="host">
      <titleInfo>
        <title>Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing</title>
      </titleInfo>
      <name type="personal">
        <namePart type="given">Christos</namePart>
        <namePart type="family">Christodoulopoulos</namePart>
        <role>
          <roleTerm authority="marcrelator" type="text">editor</roleTerm>
        </role>
      </name>
      <name type="personal">
        <namePart type="given">Tanmoy</namePart>
        <namePart type="family">Chakraborty</namePart>
        <role>
          <roleTerm authority="marcrelator" type="text">editor</roleTerm>
        </role>
      </name>
      <name type="personal">
        <namePart type="given">Carolyn</namePart>
        <namePart type="family">Rose</namePart>
        <role>
          <roleTerm authority="marcrelator" type="text">editor</roleTerm>
        </role>
      </name>
      <name type="personal">
        <namePart type="given">Violet</namePart>
        <namePart type="family">Peng</namePart>
        <role>
          <roleTerm authority="marcrelator" type="text">editor</roleTerm>
        </role>
      </name>
      <originInfo>
        <publisher>Association for Computational Linguistics</publisher>
        <place>
          <placeTerm type="text">Suzhou, China</placeTerm>
        </place>
      </originInfo>
      <genre authority="marcgt">conference publication</genre>
      <identifier type="isbn">979-8-89176-332-6</identifier>
    </relatedItem>
    <abstract>Recent work shows strong transfer learning capability to unseen languages in sequence-to-sequence neural networks, under the assumption that we have high-quality word representations for the target language. We evaluate whether this direction is a viable path forward for translation from low-resource languages by investigating how much data is required to learn such high-quality word representations. We first show that learning word embeddings separately from a translation model can enable rapid adaptation to new languages with only a few hundred sentences of parallel data. To see whether the current bottleneck in transfer to low-resource languages lies mainly with learning the word representations, we then train word embedding models on varying amounts of data and plug them into a machine translation model. We show that in this simulated low-resource setting, with only 500 parallel sentences and 31,250 sentences of monolingual data, we can exceed 15 BLEU on Flores for unseen languages. Finally, we investigate why the results are less favorable on a real low-resource language and find fault with the publicly available multilingual language modelling datasets.</abstract>
    <identifier type="citekey">mullov-waibel-2025-shot</identifier>
    <location>
      <url>https://aclanthology.org/2025.emnlp-main.163/</url>
    </location>
    <part>
      <date>2025-11</date>
      <extent unit="page">
        <start>3309</start>
        <end>3330</end>
      </extent>
    </part>
  </mods>
</modsCollection>

%0 Conference Proceedings
%T Few-Shot Learning Translation from New Languages
%A Mullov, Carlos
%A Waibel, Alexander
%Y Christodoulopoulos, Christos
%Y Chakraborty, Tanmoy
%Y Rose, Carolyn
%Y Peng, Violet
%S Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
%D 2025
%8 November
%I Association for Computational Linguistics
%C Suzhou, China
%@ 979-8-89176-332-6
%F mullov-waibel-2025-shot
%X Recent work shows strong transfer learning capability to unseen languages in sequence-to-sequence neural networks, under the assumption that we have high-quality word representations for the target language. We evaluate whether this direction is a viable path forward for translation from low-resource languages by investigating how much data is required to learn such high-quality word representations. We first show that learning word embeddings separately from a translation model can enable rapid adaptation to new languages with only a few hundred sentences of parallel data. To see whether the current bottleneck in transfer to low-resource languages lies mainly with learning the word representations, we then train word embedding models on varying amounts of data and plug them into a machine translation model. We show that in this simulated low-resource setting, with only 500 parallel sentences and 31,250 sentences of monolingual data, we can exceed 15 BLEU on Flores for unseen languages. Finally, we investigate why the results are less favorable on a real low-resource language and find fault with the publicly available multilingual language modelling datasets.
%U https://aclanthology.org/2025.emnlp-main.163/
%P 3309-3330

Markdown (Informal)
[Few-Shot Learning Translation from New Languages](https://aclanthology.org/2025.emnlp-main.163/) (Mullov & Waibel, EMNLP 2025)

ACL
Carlos Mullov and Alexander Waibel. 2025. Few-Shot Learning Translation from New Languages. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 3309–3330, Suzhou, China. Association for Computational Linguistics.