Jointly Masked Sequence-to-Sequence Model for Non-Autoregressive Neural Machine Translation

Junliang Guo; Linli Xu; Enhong Chen

doi:10.18653/v1/2020.acl-main.36

Abstract

The masked language model has received remarkable attention due to its effectiveness on various natural language processing tasks. However, few works have adopted this technique in sequence-to-sequence models. In this work, we introduce a jointly masked sequence-to-sequence model and explore its application to non-autoregressive neural machine translation (NAT). Specifically, we first empirically study the functionalities of the encoder and the decoder in NAT models, and find that the encoder plays a more important role than the decoder with regard to translation quality. Therefore, we propose to train the encoder more rigorously by masking the encoder input during training. As for the decoder, we propose to train it by masking consecutive tokens of the decoder input and applying an n-gram loss function, which alleviates the problem of translating duplicate words. The two types of masks are applied to the model jointly at the training stage. We conduct experiments on five benchmark machine translation tasks, and our model achieves BLEU scores of 27.69 and 32.24 on the WMT14 English-German and German-English tasks, respectively, with a 5+ times speedup compared with an autoregressive model.
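
The two kinds of input masking named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the mask symbol `<mask>`, the masking ratio, the span length, the choice of a single span, and the function names `mask_encoder_input` and `mask_decoder_span` are all assumptions made for illustration, and the n-gram loss is not shown.

```python
import random

MASK = "<mask>"  # hypothetical mask symbol, not taken from the paper


def mask_encoder_input(tokens, ratio=0.15, rng=None):
    """Replace a random subset of source tokens with MASK (ratio is an assumption)."""
    rng = rng or random.Random()
    # Mask at least one token whenever the sequence is non-empty.
    n = max(1, round(len(tokens) * ratio)) if tokens else 0
    positions = set(rng.sample(range(len(tokens)), n))
    return [MASK if i in positions else t for i, t in enumerate(tokens)]


def mask_decoder_span(tokens, span_len=3, rng=None):
    """Replace one run of consecutive tokens with MASK; return the result and the span."""
    rng = rng or random.Random()
    span_len = min(span_len, len(tokens))
    start = rng.randrange(len(tokens) - span_len + 1)
    end = start + span_len
    masked = tokens[:start] + [MASK] * span_len + tokens[end:]
    return masked, (start, end)


# Both masks are applied to the same training pair ("jointly").
rng = random.Random(0)
src = "wir gehen heute ins kino".split()
tgt = "we are going to the cinema today".split()
enc_in = mask_encoder_input(src, ratio=0.2, rng=rng)
dec_in, (start, end) = mask_decoder_span(tgt, span_len=3, rng=rng)
```

In a training loop, the original tokens at the masked positions would serve as prediction targets; how those targets enter the loss is left out here because the abstract does not specify it.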

Anthology ID: 2020.acl-main.36
Volume: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
Month: July
Year: 2020
Address: Online
Editors: Dan Jurafsky, Joyce Chai, Natalie Schluter, Joel Tetreault
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 376–385
URL: https://aclanthology.org/2020.acl-main.36/
DOI: 10.18653/v1/2020.acl-main.36
Cite (ACL): Junliang Guo, Linli Xu, and Enhong Chen. 2020. Jointly Masked Sequence-to-Sequence Model for Non-Autoregressive Neural Machine Translation. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 376–385, Online. Association for Computational Linguistics.
Cite (Informal): Jointly Masked Sequence-to-Sequence Model for Non-Autoregressive Neural Machine Translation (Guo et al., ACL 2020)
PDF: https://aclanthology.org/2020.acl-main.36.pdf
Video: http://slideslive.com/38928715
