Improving Retrieval Augmented Neural Machine Translation by Controlling Source and Fuzzy-Match Interactions

Cuong Hoang; Devendra Sachan; Prashant Mathur; Brian Thompson; Marcello Federico

doi:10.18653/v1/2023.findings-eacl.22

Abstract

We explore zero-shot adaptation, where a general-domain model has access to customer- or domain-specific parallel data at inference time, but not during training. We build on the idea of Retrieval Augmented Translation (RAT), where top-k in-domain fuzzy matches are found for the source sentence, and target-language translations of those fuzzy-matched sentences are provided to the translation model at inference time. We propose a novel architecture to control interactions between a source sentence and the top-k fuzzy target-language matches, and compare it to architectures from prior work. We conduct experiments in two language pairs (En-De and En-Fr) by training models on WMT data and testing them with five and seven multi-domain datasets, respectively. Our approach consistently outperforms the alternative architectures, improving BLEU across language pairs, domains, and values of k (the number of fuzzy matches).
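
The retrieval step described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not say how fuzzy matches are scored, so token-level similarity from Python's `difflib` stands in for the retrieval metric. The translation memory, the function names `top_k_fuzzy_matches` and `build_model_input`, and the `<sep>` separator are all hypothetical. Plain concatenation is shown only as one simple way to hand the matches to a model; it is not the architecture proposed in the paper.

```python
from difflib import SequenceMatcher


def top_k_fuzzy_matches(source, memory, k=2):
    """Return the target sides of the k memory entries whose source side
    is most similar to `source` (token-level difflib ratio, an assumed metric)."""
    scored = [
        (SequenceMatcher(None, source.split(), src.split()).ratio(), tgt)
        for src, tgt in memory
    ]
    # Highest similarity first; Python's sort is stable, so ties keep memory order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [tgt for _, tgt in scored[:k]]


def build_model_input(source, fuzzy_targets, sep=" <sep> "):
    """Join the source with the retrieved target-language matches.
    Concatenation is an illustrative assumption, not the paper's method."""
    return sep.join([source, *fuzzy_targets])


# Hypothetical in-domain translation memory of (source, target) pairs.
memory = [
    ("Press the power button to start the device.",
     "Drücken Sie die Ein/Aus-Taste, um das Gerät zu starten."),
    ("Press the reset button to restart the device.",
     "Drücken Sie die Reset-Taste, um das Gerät neu zu starten."),
    ("The weather is nice today.",
     "Das Wetter ist heute schön."),
]

source = "Press the power button to restart the device."
matches = top_k_fuzzy_matches(source, memory, k=2)
model_input = build_model_input(source, matches)
print(model_input)
```

Because the memory is consulted only at inference time, swapping in a different customer's or domain's memory changes the retrieved matches without retraining the model, which is the zero-shot setting the abstract describes.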

Anthology ID: 2023.findings-eacl.22
Volume: Findings of the Association for Computational Linguistics: EACL 2023
Month: May
Year: 2023
Address: Dubrovnik, Croatia
Editors: Andreas Vlachos, Isabelle Augenstein
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 289–295
URL: https://aclanthology.org/2023.findings-eacl.22
DOI: 10.18653/v1/2023.findings-eacl.22
Cite (ACL): Cuong Hoang, Devendra Sachan, Prashant Mathur, Brian Thompson, and Marcello Federico. 2023. Improving Retrieval Augmented Neural Machine Translation by Controlling Source and Fuzzy-Match Interactions. In Findings of the Association for Computational Linguistics: EACL 2023, pages 289–295, Dubrovnik, Croatia. Association for Computational Linguistics.
Cite (Informal): Improving Retrieval Augmented Neural Machine Translation by Controlling Source and Fuzzy-Match Interactions (Hoang et al., Findings 2023)
PDF: https://aclanthology.org/2023.findings-eacl.22.pdf
Video: https://aclanthology.org/2023.findings-eacl.22.mp4