Neural Fuzzy Repair: Integrating Fuzzy Matches into Neural Machine Translation

Bram Bulte, Arda Tezcan


Abstract
We present a simple yet powerful data augmentation method for boosting Neural Machine Translation (NMT) performance by leveraging information retrieved from a Translation Memory (TM). We propose and test two methods for augmenting NMT training data with fuzzy TM matches. Tests on the DGT-TM data set for two language pairs show consistent and substantial improvements over a range of baseline systems. The results suggest that this method is promising for any translation environment in which a sizeable TM is available and a certain amount of repetition across translations is to be expected, especially considering its ease of implementation.
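The abstract does not spell out the two augmentation methods, so the following is only a minimal sketch of one plausible form of the idea: retrieve the most similar TM source segment and append its target side to the input sentence. The separator token `@@@`, the 0.5 threshold, the function names, and the use of Python's `difflib` ratio as the similarity score are all illustrative assumptions, not details taken from this page.

```python
from difflib import SequenceMatcher

SEP = "@@@"  # hypothetical separator token (assumption, not from the abstract)


def fuzzy_score(a, b):
    """Token-level similarity in [0, 1]; difflib's ratio stands in for a fuzzy match score."""
    return SequenceMatcher(None, a.split(), b.split()).ratio()


def best_match(source, tm, threshold=0.5, exclude_index=None):
    """Return the (source, target) TM pair most similar to `source`, or None.

    `tm` is a list of (source, target) pairs. `exclude_index` lets a training
    sentence skip its own TM entry so it is not matched with itself.
    """
    best, best_score = None, -1.0
    for i, (tm_src, tm_tgt) in enumerate(tm):
        if i == exclude_index:
            continue
        score = fuzzy_score(source, tm_src)
        if score >= threshold and score > best_score:
            best, best_score = (tm_src, tm_tgt), score
    return best


def augment_source(source, tm, threshold=0.5, exclude_index=None):
    """Append the target side of the best fuzzy match to the source sentence."""
    match = best_match(source, tm, threshold, exclude_index)
    if match is None:
        return source  # no sufficiently similar match: leave the input unchanged
    return f"{source} {SEP} {match[1]}"


tm = [
    ("the committee adopted the report", "le comité a adopté le rapport"),
    ("the council rejected the proposal", "le conseil a rejeté la proposition"),
]
augmented = augment_source("the committee adopted the proposal", tm)
```

In this sketch the augmented string would replace the original source side of a training pair, and the same lookup would be applied to each input at translation time. Sentences with no match above the threshold pass through unchanged.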
Anthology ID:
P19-1175
Volume:
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
Month:
July
Year:
2019
Address:
Florence, Italy
Editors:
Anna Korhonen, David Traum, Lluís Màrquez
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
1800–1809
URL:
https://aclanthology.org/P19-1175
DOI:
10.18653/v1/P19-1175
Bibkey:
Cite (ACL):
Bram Bulte and Arda Tezcan. 2019. Neural Fuzzy Repair: Integrating Fuzzy Matches into Neural Machine Translation. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 1800–1809, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
Neural Fuzzy Repair: Integrating Fuzzy Matches into Neural Machine Translation (Bulte & Tezcan, ACL 2019)
PDF:
https://aclanthology.org/P19-1175.pdf
Video:
https://aclanthology.org/P19-1175.mp4