Correcting Length Bias in Neural Machine Translation

Kenton Murray, David Chiang


Abstract
We study two problems in neural machine translation (NMT). First, in beam search, whereas a wider beam should in principle help translation, it often hurts NMT. Second, NMT has a tendency to produce translations that are too short. Here, we argue that these problems are closely related and both rooted in label bias. We show that correcting the brevity problem almost eliminates the beam problem; we compare some commonly-used methods for doing this, finding that a simple per-word reward works well; and we introduce a simple and quick way to tune this reward using the perceptron algorithm.
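The per-word reward described in the abstract can be sketched as follows. This is an illustrative toy, not the authors' implementation: the model, token names, beam settings, and learning rate are all invented for the example. The idea is that a locally normalized NMT model assigns every extra token a negative log-probability, so beam search favors short hypotheses; adding a constant reward gamma per output token counteracts this, and gamma can be tuned with a perceptron-style update that moves output lengths toward reference lengths.

```python
import math

def beam_search(step_logprobs_fn, beam_size=4, max_len=10, gamma=0.0, eos="</s>"):
    """Beam search scoring hypotheses by log P(y|x) + gamma * |y|.

    step_logprobs_fn(prefix) -> {token: log-prob} for the next token.
    Returns the best (tokens, model_logprob) pair under the rewarded score.
    """
    beams = [([], 0.0)]          # (tokens, summed log-prob)
    finished = []                # hypotheses that produced EOS
    for _ in range(max_len):
        candidates = []
        for toks, lp in beams:
            for tok, tok_lp in step_logprobs_fn(toks).items():
                hyp = (toks + [tok], lp + tok_lp)
                (finished if tok == eos else candidates).append(hyp)
        if not candidates:
            break
        # rescore with the per-word reward before pruning
        candidates.sort(key=lambda h: h[1] + gamma * len(h[0]), reverse=True)
        beams = candidates[:beam_size]
    pool = finished + beams
    return max(pool, key=lambda h: h[1] + gamma * len(h[0]))

def perceptron_update_gamma(gamma, ref_len, hyp_len, lr=0.1):
    """Perceptron-style update: if outputs are too short (hyp_len < ref_len),
    increase gamma; if too long, decrease it."""
    return gamma + lr * (ref_len - hyp_len)
```

As a toy check, a model that always prefers EOS (P(EOS) = 0.6, P("a") = 0.4) stops immediately with gamma = 0, while a sufficiently large reward (here gamma = 1.0, which outweighs log 0.4 per token) pushes the search toward longer hypotheses, up to the length cap.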
Anthology ID:
W18-6322
Volume:
Proceedings of the Third Conference on Machine Translation: Research Papers
Month:
October
Year:
2018
Address:
Brussels, Belgium
Editors:
Ondřej Bojar, Rajen Chatterjee, Christian Federmann, Mark Fishel, Yvette Graham, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Philipp Koehn, Christof Monz, Matteo Negri, Aurélie Névéol, Mariana Neves, Matt Post, Lucia Specia, Marco Turchi, Karin Verspoor
Venue:
WMT
SIG:
SIGMT
Publisher:
Association for Computational Linguistics
Pages:
212–223
URL:
https://aclanthology.org/W18-6322
DOI:
10.18653/v1/W18-6322
Bibkey:
Cite (ACL):
Kenton Murray and David Chiang. 2018. Correcting Length Bias in Neural Machine Translation. In Proceedings of the Third Conference on Machine Translation: Research Papers, pages 212–223, Brussels, Belgium. Association for Computational Linguistics.
Cite (Informal):
Correcting Length Bias in Neural Machine Translation (Murray & Chiang, WMT 2018)
PDF:
https://aclanthology.org/W18-6322.pdf