Domain Adaptation for NMT via Filtered Iterative Back-Translation

Surabhi Kumari, Nikhil Jaiswal, Mayur Patidar, Manasi Patwardhan, Shirish Karande, Puneet Agarwal, Lovekesh Vig


Abstract
A domain-specific Neural Machine Translation (NMT) model can provide improved performance; however, a domain-specific parallel corpus is not always available. Iterative Back-Translation can be used to fine-tune an NMT model for a domain even when only a monolingual in-domain corpus is available. The quality of the synthetic parallel corpus, in terms of its closeness to in-domain sentences, can play an important role in the performance of the translation model. Recent works have shown that filtering at different stages of back-translation and weighting the sentences can provide state-of-the-art performance. In comparison, in this work we observe that a simpler filtering approach based on a domain classifier, applied only to the pseudo-training data, can consistently perform better, providing gains of 1.40, 1.82, and 0.76 BLEU for Medical, Law, and IT in one translation direction, and 1.28, 1.60, and 1.60 in the other direction, over competitive baselines in the low-resource scenario. In the high-resource scenario, our approach is on par with competitive baselines.
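To make the approach concrete, below is a minimal Python sketch of iterative back-translation with classifier-based filtering of the pseudo-training data, as described in the abstract. All model calls (translate_tgt2src, fine_tune, domain_prob) and the threshold value are hypothetical placeholders, not details from the paper; only the control flow of back-translate, filter, fine-tune reflects the described approach.

from typing import Callable, List, Tuple

def filter_pseudo_corpus(
    pseudo_pairs: List[Tuple[str, str]],
    domain_prob: Callable[[str], float],  # hypothetical: P(in-domain | sentence)
    threshold: float = 0.5,               # assumed cutoff, not from the paper
) -> List[Tuple[str, str]]:
    # Keep only pseudo-parallel pairs whose synthetic source side the
    # domain classifier scores as in-domain.
    return [(src, tgt) for src, tgt in pseudo_pairs
            if domain_prob(src) >= threshold]

def iterative_back_translation(
    mono_tgt: List[str],                          # in-domain monolingual target sentences
    translate_tgt2src: Callable[[str], str],      # hypothetical reverse model: target -> source
    fine_tune: Callable[[List[Tuple[str, str]]], None],  # hypothetical update of the forward model
    domain_prob: Callable[[str], float],
    rounds: int = 3,
    threshold: float = 0.5,
) -> None:
    for _ in range(rounds):
        # 1. Back-translate in-domain monolingual data into synthetic sources.
        pseudo = [(translate_tgt2src(t), t) for t in mono_tgt]
        # 2. Filter the pseudo-training data with the domain classifier
        #    (the only filtering stage used, per the abstract).
        kept = filter_pseudo_corpus(pseudo, domain_prob, threshold)
        # 3. Fine-tune the forward translation model on the filtered corpus.
        fine_tune(kept)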
Anthology ID:
2021.adaptnlp-1.26
Volume:
Proceedings of the Second Workshop on Domain Adaptation for NLP
Month:
April
Year:
2021
Address:
Kyiv, Ukraine
Editors:
Eyal Ben-David, Shay Cohen, Ryan McDonald, Barbara Plank, Roi Reichart, Guy Rotman, Yftah Ziser
Venue:
AdaptNLP
Publisher:
Association for Computational Linguistics
Pages:
263–271
URL:
https://aclanthology.org/2021.adaptnlp-1.26
Cite (ACL):
Surabhi Kumari, Nikhil Jaiswal, Mayur Patidar, Manasi Patwardhan, Shirish Karande, Puneet Agarwal, and Lovekesh Vig. 2021. Domain Adaptation for NMT via Filtered Iterative Back-Translation. In Proceedings of the Second Workshop on Domain Adaptation for NLP, pages 263–271, Kyiv, Ukraine. Association for Computational Linguistics.
Cite (Informal):
Domain Adaptation for NMT via Filtered Iterative Back-Translation (Kumari et al., AdaptNLP 2021)
PDF:
https://aclanthology.org/2021.adaptnlp-1.26.pdf