Language Model Prior for Low-Resource Neural Machine Translation

Christos Baziotis; Barry Haddow; Alexandra Birch

doi:10.18653/v1/2020.emnlp-main.615

Abstract

The scarcity of large parallel corpora is an important obstacle for neural machine translation. A common solution is to exploit the knowledge of language models (LMs) trained on abundant monolingual data. In this work, we propose a novel approach to incorporate an LM as a prior in a neural translation model (TM). Specifically, we add a regularization term, which pushes the output distributions of the TM to be probable under the LM prior, while avoiding wrong predictions when the TM “disagrees” with the LM. This objective relates to knowledge distillation, where the LM can be viewed as teaching the TM about the target language. The proposed approach does not compromise decoding speed, because the LM is used only at training time, unlike previous work that requires it during inference. We present an analysis of the effects that different methods have on the distributions of the TM. Results on two low-resource machine translation datasets show clear improvements even with limited monolingual data.
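The regularized objective described in the abstract can be illustrated with a minimal per-token sketch. The abstract does not give the exact form of the regularization term, so this sketch assumes it is a KL divergence from the TM distribution to the LM distribution, added to the usual cross-entropy loss. The names `lm_prior_loss`, `prior_weight`, and `temperature` are hypothetical and chosen only for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities, optionally softened by a temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same vocabulary."""
    return sum(
        pi * (math.log(pi + eps) - math.log(qi + eps)) for pi, qi in zip(p, q)
    )


def lm_prior_loss(tm_logits, lm_logits, target_index,
                  prior_weight=0.5, temperature=1.0):
    """Per-token loss: cross-entropy plus a weighted LM-prior regularizer.

    Assumption: the regularizer is KL(p_TM || p_LM). Because the TM
    distribution weights each term, the penalty is large where the TM puts
    mass on tokens the LM finds improbable, and small where the TM already
    assigns low probability.
    """
    # Standard translation loss on the reference token.
    p_tm = softmax(tm_logits)
    cross_entropy = -math.log(p_tm[target_index])

    # Regularizer computed on (optionally) temperature-softened distributions.
    p_tm_soft = softmax(tm_logits, temperature)
    p_lm_soft = softmax(lm_logits, temperature)
    regularizer = kl_divergence(p_tm_soft, p_lm_soft)

    return cross_entropy + prior_weight * regularizer


# Toy example with a three-token vocabulary; the LM logits would come from a
# frozen, pretrained LM and are needed only while training.
tm_logits = [2.0, 0.5, -1.0]
lm_logits = [0.0, 1.0, 0.0]
loss = lm_prior_loss(tm_logits, lm_logits, target_index=0)
```

In this sketch the LM logits enter only the loss computation, which matches the abstract's point that the LM is used at training time and not during inference.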

Anthology ID: 2020.emnlp-main.615
Original: 2020.emnlp-main.615v1
Version 2: 2020.emnlp-main.615v2
Volume: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
Month: November
Year: 2020
Address: Online
Editors: Bonnie Webber, Trevor Cohn, Yulan He, Yang Liu
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 7622–7634
URL: https://aclanthology.org/2020.emnlp-main.615/
DOI: 10.18653/v1/2020.emnlp-main.615
Cite (ACL): Christos Baziotis, Barry Haddow, and Alexandra Birch. 2020. Language Model Prior for Low-Resource Neural Machine Translation. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 7622–7634, Online. Association for Computational Linguistics.
Cite (Informal): Language Model Prior for Low-Resource Neural Machine Translation (Baziotis et al., EMNLP 2020)
PDF: https://aclanthology.org/2020.emnlp-main.615.pdf
Video: https://slideslive.com/38938725
