In-Context Example Selection via Similarity Search Improves Low-Resource Machine Translation

Armel Randy Zebaze; Benoît Sagot; Rachel Bawden

doi:10.18653/v1/2025.findings-naacl.68

Abstract

The ability of generative large language models (LLMs) to perform in-context learning has given rise to a large body of research into how best to prompt models for various natural language processing tasks. In this paper, we focus on machine translation (MT), a task that has been shown to benefit from in-context translation examples. However, no systematic studies have been published on how best to select examples, and mixed results have been reported on the usefulness of similarity-based selection over random selection, although these results have mainly been shown for high-resource languages. We provide a study covering multiple LLMs and in-context example retrieval strategies. Contrary to previously published results, we find that retrieval based on sentence embedding similarity can improve MT, especially for low-resource language directions, and we also discuss the balance between selection pool diversity and quality. Code and outputs will be made freely available.
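
As a rough illustration of the similarity-based selection described in the abstract, the sketch below ranks a pool of translation pairs by cosine similarity to the sentence to be translated and assembles a few-shot prompt from the top matches. It is not the authors' implementation: the bag-of-words `embed` function is a toy stand-in for a real sentence encoder, and the function names, the prompt template, and the three-pair pool are all assumptions made for this example.

```python
import math
from collections import Counter

def embed(sentence):
    """Toy stand-in for a sentence encoder (assumption): bag-of-words counts."""
    return Counter(sentence.lower().split())

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def select_examples(source, pool, k=2):
    """Return the k (source, target) pairs whose source side is closest to `source`."""
    query = embed(source)
    ranked = sorted(pool, key=lambda pair: cosine(query, embed(pair[0])), reverse=True)
    return ranked[:k]

def build_prompt(source, examples, src_lang="English", tgt_lang="French"):
    """Hypothetical few-shot template: retrieved pairs first, then the open query."""
    blocks = [f"{src_lang}: {src}\n{tgt_lang}: {tgt}" for src, tgt in examples]
    blocks.append(f"{src_lang}: {source}\n{tgt_lang}:")
    return "\n\n".join(blocks)

# Hypothetical selection pool of (source, target) pairs.
POOL = [
    ("The cat sleeps on the sofa.", "Le chat dort sur le canapé."),
    ("I bought bread this morning.", "J'ai acheté du pain ce matin."),
    ("The dog sleeps on the bed.", "Le chien dort sur le lit."),
]

query = "The cat sleeps on the bed."
chosen = select_examples(query, POOL, k=2)
prompt = build_prompt(query, chosen)
print(prompt)
```

In practice, `embed` would be replaced by a multilingual sentence embedding model and the sorted scan by a nearest-neighbour index; the ranking and prompt assembly would stay the same.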

Anthology ID: 2025.findings-naacl.68
Volume: Findings of the Association for Computational Linguistics: NAACL 2025
Month: April
Year: 2025
Address: Albuquerque, New Mexico
Editors: Luis Chiruzzo, Alan Ritter, Lu Wang
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 1222–1252
URL: https://aclanthology.org/2025.findings-naacl.68/
DOI: 10.18653/v1/2025.findings-naacl.68
Cite (ACL): Armel Randy Zebaze, Benoît Sagot, and Rachel Bawden. 2025. In-Context Example Selection via Similarity Search Improves Low-Resource Machine Translation. In Findings of the Association for Computational Linguistics: NAACL 2025, pages 1222–1252, Albuquerque, New Mexico. Association for Computational Linguistics.
Cite (Informal): In-Context Example Selection via Similarity Search Improves Low-Resource Machine Translation (Zebaze et al., Findings 2025)
PDF: https://aclanthology.org/2025.findings-naacl.68.pdf
