In-context Examples Selection for Machine Translation

Sweta Agrawal, Chunting Zhou, Mike Lewis, Luke Zettlemoyer, Marjan Ghazvininejad


Abstract
Large-scale generative models show an impressive ability to perform a wide range of Natural Language Processing (NLP) tasks using in-context learning, where a few examples are used to describe a task to the model. For Machine Translation (MT), these examples are typically randomly sampled from a development dataset whose distribution is similar to that of the evaluation set. However, it is unclear how the choice of these in-context examples and their ordering impacts the output translation quality. In this work, we aim to understand the properties of good in-context examples for MT in both in-domain and out-of-domain settings. We show that the translation quality and the domain of the in-context examples matter and that 1-shot noisy, unrelated examples can have a catastrophic impact on output quality. While concatenating multiple random examples reduces the effect of noise, a single good prompt optimized to maximize translation quality on the development dataset can elicit learned information from the pre-trained language model. Adding similar examples based on an n-gram overlap with the test source significantly and consistently improves the translation quality of the outputs, outperforming a strong kNN-MT baseline in 2 out of 4 out-of-domain datasets.
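The idea of selecting in-context examples by n-gram overlap with the test source can be illustrated with a minimal sketch. It assumes whitespace tokenization and a plain average of 1- to 4-gram recall as the similarity score; the function names (`overlap_score`, `select_examples`, `build_prompt`), the prompt template, and the toy example pool are hypothetical and are not taken from the paper, whose exact retrieval and scoring procedure may differ.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def overlap_score(test_source, candidate_source, max_n=4):
    """Average n-gram recall (n = 1..max_n) of the test source in the candidate.

    Assumption: lowercased whitespace tokens; orders for which the test
    source has no n-grams are skipped.
    """
    test_tokens = test_source.lower().split()
    cand_tokens = candidate_source.lower().split()
    recalls = []
    for n in range(1, max_n + 1):
        test_counts = ngrams(test_tokens, n)
        total = sum(test_counts.values())
        if total == 0:
            continue
        cand_counts = ngrams(cand_tokens, n)
        # Counter intersection keeps the minimum count of each shared n-gram.
        matched = sum((test_counts & cand_counts).values())
        recalls.append(matched / total)
    return sum(recalls) / len(recalls) if recalls else 0.0


def select_examples(test_source, pool, k=2):
    """Return the k (source, target) pairs whose source best overlaps the test source."""
    ranked = sorted(pool, key=lambda pair: overlap_score(test_source, pair[0]), reverse=True)
    return ranked[:k]


def build_prompt(test_source, examples):
    """Concatenate the selected examples and the test source (hypothetical template)."""
    lines = [f"Source: {src}\nTarget: {tgt}" for src, tgt in examples]
    lines.append(f"Source: {test_source}\nTarget:")
    return "\n\n".join(lines)


# Toy pool of (source, target) pairs, invented for illustration only.
pool = [
    ("the cat sat on the mat", "die Katze saß auf der Matte"),
    ("stock prices fell sharply", "die Aktienkurse fielen stark"),
    ("the dog sat on the rug", "der Hund saß auf dem Teppich"),
]
test_source = "the cat sat on the rug"
selected = select_examples(test_source, pool, k=2)
prompt = build_prompt(test_source, selected)
```

Here the unrelated finance sentence scores zero and is left out of the prompt, while the two sentences that share n-grams with the test source are kept, most similar first.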
Anthology ID:
2023.findings-acl.564
Volume:
Findings of the Association for Computational Linguistics: ACL 2023
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
8857–8873
URL:
https://aclanthology.org/2023.findings-acl.564
DOI:
10.18653/v1/2023.findings-acl.564
Cite (ACL):
Sweta Agrawal, Chunting Zhou, Mike Lewis, Luke Zettlemoyer, and Marjan Ghazvininejad. 2023. In-context Examples Selection for Machine Translation. In Findings of the Association for Computational Linguistics: ACL 2023, pages 8857–8873, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
In-context Examples Selection for Machine Translation (Agrawal et al., Findings 2023)
PDF:
https://aclanthology.org/2023.findings-acl.564.pdf