Prefix Embeddings for In-context Machine Translation

Suzanna Sia, Kevin Duh


Abstract
Very large language models have been shown to translate with few-shot in-context examples. However, they have not achieved state-of-the-art results for translating out of English. In this work, we investigate an extremely lightweight fixed-parameter method for conditioning a large language model to better translate into the target language. Our method introduces additional embeddings, known as prefix embeddings, which do not interfere with the existing weights of the model. Using unsupervised and weakly semi-supervised methods that train only 0.0001% of the model parameters, this simple method improves BLEU by ~0.2–1.3 points across 3 domains and 3 languages. We analyze the resulting embeddings’ training dynamics and where they lie in the embedding space, and show that our trained embeddings can be used both for in-context translation and for diverse generation of the target sentence.
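The abstract describes conditioning a frozen language model with a small set of trainable prefix embeddings prepended to the input, so that only a tiny fraction of parameters is updated. The sketch below illustrates this general prefix-embedding idea; the model name (gpt2), prefix length, and optimizer settings are illustrative assumptions, not the paper's exact setup.

```python
# Minimal sketch of prefix-embedding tuning with a frozen causal LM.
# gpt2, n_prefix=10, and lr are placeholder assumptions for illustration.
import torch
from torch import nn
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # stand-in for a much larger LM
tok = AutoTokenizer.from_pretrained(model_name)
lm = AutoModelForCausalLM.from_pretrained(model_name)
for p in lm.parameters():       # freeze all original model weights
    p.requires_grad = False

n_prefix = 10                   # number of new prefix embeddings (assumption)
d_model = lm.get_input_embeddings().embedding_dim
prefix = nn.Parameter(torch.randn(n_prefix, d_model) * 0.02)  # the only trainable parameters

def forward_with_prefix(input_ids, labels=None):
    """Prepend the trainable prefix embeddings to the token embeddings."""
    tok_emb = lm.get_input_embeddings()(input_ids)         # (B, T, d)
    batch = input_ids.size(0)
    pre = prefix.unsqueeze(0).expand(batch, -1, -1)        # (B, n_prefix, d)
    inputs_embeds = torch.cat([pre, tok_emb], dim=1)
    if labels is not None:
        # do not compute loss on the prefix positions
        pad = torch.full((batch, n_prefix), -100, dtype=labels.dtype)
        labels = torch.cat([pad, labels], dim=1)
    return lm(inputs_embeds=inputs_embeds, labels=labels)

# Only `prefix` receives gradients, so the trained fraction of parameters is tiny.
optimizer = torch.optim.Adam([prefix], lr=1e-3)
```

In this setup the in-context translation examples remain ordinary tokens in `input_ids`; the prefix vectors simply sit in front of them, steering generation toward the target language without modifying any existing weight.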
Anthology ID:
2022.amta-research.4
Volume:
Proceedings of the 15th biennial conference of the Association for Machine Translation in the Americas (Volume 1: Research Track)
Month:
September
Year:
2022
Address:
Orlando, USA
Editors:
Kevin Duh, Francisco Guzmán
Venue:
AMTA
Publisher:
Association for Machine Translation in the Americas
Pages:
45–57
URL:
https://aclanthology.org/2022.amta-research.4
Cite (ACL):
Suzanna Sia and Kevin Duh. 2022. Prefix Embeddings for In-context Machine Translation. In Proceedings of the 15th biennial conference of the Association for Machine Translation in the Americas (Volume 1: Research Track), pages 45–57, Orlando, USA. Association for Machine Translation in the Americas.
Cite (Informal):
Prefix Embeddings for In-context Machine Translation (Sia & Duh, AMTA 2022)
PDF:
https://aclanthology.org/2022.amta-research.4.pdf
Data
MTNT, The Pile