@inproceedings{salim-etal-2026-beyond,
title = "Beyond Many-Shot Translation: Scaling In-Context Demonstrations For Low-Resource Machine Translation",
author = "Salim, Luis Frentzen and
Carlin, Esteban and
Morinvil, Alexandre and
Ai, Xi and
Ku, Lun-Wei",
editor = "Hettiarachchi, Hansi and
Ranasinghe, Tharindu and
Plum, Alistair and
Rayson, Paul and
Mitkov, Ruslan and
Gaber, Mohamed and
Premasiri, Damith and
Tan, Fiona Anting and
Uyangodage, Lasitha",
booktitle = "Proceedings of the Second Workshop on Language Models for Low-Resource Languages ({L}o{R}es{LM} 2026)",
month = mar,
year = "2026",
address = "Rabat, Morocco",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2026.loreslm-1.34/",
pages = "388--407",
ISBN = "979-8-89176-377-7",
abstract = "Building machine translation (MT) systems for low-resource languages is notably difficult due to the scarcity of high-quality data. Although Large Language Models (LLMs) have improved MT system performance, adapting them to lesser-represented languages remains challenging. In-context learning (ICL) may offer novel ways to adapt LLMs for low-resource MT by conditioning models on demonstration at inference time. In this study, we explore scaling low-resource machine translation ICL beyond the few-shot setting to thousands of examples with long-context models. We scale in-context token budget to 1M tokens and compare three types of training corpora used as in-context supervision: monolingual unsupervised data, instruction-style data, and parallel data (English{--}target and Indonesian{--}target). Our experiments on Javanese and Sundanese show that gains from additional context saturate quickly and can degrade near the maximum context window, with scaling behavior strongly dependent on corpus type. Notably, some forms of monolingual supervision can be competitive with parallel data, despite the latter offering additional supervision. Overall, our results characterize the effective limits and corpus-type sensitivity of long-context ICL for low-resource MT, highlighting that larger context windows do not necessarily yield proportional quality gains."
}<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="salim-etal-2026-beyond">
<titleInfo>
<title>Beyond Many-Shot Translation: Scaling In-Context Demonstrations For Low-Resource Machine Translation</title>
</titleInfo>
<name type="personal">
<namePart type="given">Luis</namePart>
<namePart type="given">Frentzen</namePart>
<namePart type="family">Salim</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Esteban</namePart>
<namePart type="family">Carlin</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Alexandre</namePart>
<namePart type="family">Morinvil</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Xi</namePart>
<namePart type="family">Ai</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Lun-Wei</namePart>
<namePart type="family">Ku</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<originInfo>
<dateIssued>2026-03</dateIssued>
</originInfo>
<typeOfResource>text</typeOfResource>
<relatedItem type="host">
<titleInfo>
<title>Proceedings of the Second Workshop on Language Models for Low-Resource Languages (LoResLM 2026)</title>
</titleInfo>
<name type="personal">
<namePart type="given">Hansi</namePart>
<namePart type="family">Hettiarachchi</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Tharindu</namePart>
<namePart type="family">Ranasinghe</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Alistair</namePart>
<namePart type="family">Plum</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Paul</namePart>
<namePart type="family">Rayson</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Ruslan</namePart>
<namePart type="family">Mitkov</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Mohamed</namePart>
<namePart type="family">Gaber</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Damith</namePart>
<namePart type="family">Premasiri</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Fiona</namePart>
<namePart type="given">Anting</namePart>
<namePart type="family">Tan</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Lasitha</namePart>
<namePart type="family">Uyangodage</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<originInfo>
<publisher>Association for Computational Linguistics</publisher>
<place>
<placeTerm type="text">Rabat, Morocco</placeTerm>
</place>
</originInfo>
<genre authority="marcgt">conference publication</genre>
<identifier type="isbn">979-8-89176-377-7</identifier>
</relatedItem>
<abstract>Building machine translation (MT) systems for low-resource languages is notably difficult due to the scarcity of high-quality data. Although Large Language Models (LLMs) have improved MT system performance, adapting them to lesser-represented languages remains challenging. In-context learning (ICL) may offer novel ways to adapt LLMs for low-resource MT by conditioning models on demonstration at inference time. In this study, we explore scaling low-resource machine translation ICL beyond the few-shot setting to thousands of examples with long-context models. We scale in-context token budget to 1M tokens and compare three types of training corpora used as in-context supervision: monolingual unsupervised data, instruction-style data, and parallel data (English–target and Indonesian–target). Our experiments on Javanese and Sundanese show that gains from additional context saturate quickly and can degrade near the maximum context window, with scaling behavior strongly dependent on corpus type. Notably, some forms of monolingual supervision can be competitive with parallel data, despite the latter offering additional supervision. Overall, our results characterize the effective limits and corpus-type sensitivity of long-context ICL for low-resource MT, highlighting that larger context windows do not necessarily yield proportional quality gains.</abstract>
<identifier type="citekey">salim-etal-2026-beyond</identifier>
<location>
<url>https://aclanthology.org/2026.loreslm-1.34/</url>
</location>
<part>
<date>2026-03</date>
<extent unit="page">
<start>388</start>
<end>407</end>
</extent>
</part>
</mods>
</modsCollection>
%0 Conference Proceedings
%T Beyond Many-Shot Translation: Scaling In-Context Demonstrations For Low-Resource Machine Translation
%A Salim, Luis Frentzen
%A Carlin, Esteban
%A Morinvil, Alexandre
%A Ai, Xi
%A Ku, Lun-Wei
%Y Hettiarachchi, Hansi
%Y Ranasinghe, Tharindu
%Y Plum, Alistair
%Y Rayson, Paul
%Y Mitkov, Ruslan
%Y Gaber, Mohamed
%Y Premasiri, Damith
%Y Tan, Fiona Anting
%Y Uyangodage, Lasitha
%S Proceedings of the Second Workshop on Language Models for Low-Resource Languages (LoResLM 2026)
%D 2026
%8 March
%I Association for Computational Linguistics
%C Rabat, Morocco
%@ 979-8-89176-377-7
%F salim-etal-2026-beyond
%X Building machine translation (MT) systems for low-resource languages is notably difficult due to the scarcity of high-quality data. Although Large Language Models (LLMs) have improved MT system performance, adapting them to lesser-represented languages remains challenging. In-context learning (ICL) may offer novel ways to adapt LLMs for low-resource MT by conditioning models on demonstrations at inference time. In this study, we explore scaling low-resource machine translation ICL beyond the few-shot setting to thousands of examples with long-context models. We scale the in-context token budget to 1M tokens and compare three types of training corpora used as in-context supervision: monolingual unsupervised data, instruction-style data, and parallel data (English–target and Indonesian–target). Our experiments on Javanese and Sundanese show that gains from additional context saturate quickly and can degrade near the maximum context window, with scaling behavior strongly dependent on corpus type. Notably, some forms of monolingual supervision can be competitive with parallel data, despite the latter offering additional supervision. Overall, our results characterize the effective limits and corpus-type sensitivity of long-context ICL for low-resource MT, highlighting that larger context windows do not necessarily yield proportional quality gains.
%U https://aclanthology.org/2026.loreslm-1.34/
%P 388-407
Markdown (Informal)
[Beyond Many-Shot Translation: Scaling In-Context Demonstrations For Low-Resource Machine Translation](https://aclanthology.org/2026.loreslm-1.34/) (Salim et al., LoResLM 2026)
ACL
Luis Frentzen Salim, Esteban Carlin, Alexandre Morinvil, Xi Ai, and Lun-Wei Ku. 2026. Beyond Many-Shot Translation: Scaling In-Context Demonstrations For Low-Resource Machine Translation. In Proceedings of the Second Workshop on Language Models for Low-Resource Languages (LoResLM 2026), pages 388–407, Rabat, Morocco. Association for Computational Linguistics.