Incorporating Source Syntax into Transformer-Based Neural Machine Translation

Anna Currey, Kenneth Heafield


Abstract
Transformer-based neural machine translation (NMT) has recently achieved state-of-the-art performance on many machine translation tasks. However, recent work (Raganato and Tiedemann, 2018; Tang et al., 2018; Tran et al., 2018) has indicated that Transformer models may not learn syntactic structures as well as their recurrent neural network-based counterparts, particularly in low-resource cases. In this paper, we incorporate constituency parse information into a Transformer NMT model. We leverage linearized parses of the source training sentences in order to inject syntax into the Transformer architecture without modifying it. We introduce two methods: a multi-task machine translation and parsing model with a single encoder and decoder, and a mixed encoder model that learns to translate directly from parsed and unparsed source sentences. We evaluate our methods on low-resource translation from English into twenty target languages, showing consistent improvements of 1.3 BLEU on average across diverse target languages for the multi-task technique. We further evaluate the models on full-scale WMT tasks, finding that the multi-task model aids low- and medium-resource NMT but degrades high-resource English-German translation.
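To make the idea of a linearized parse concrete, the following is a minimal sketch of how a bracketed constituency parse could be flattened into a token sequence and paired with a source sentence for multi-task training. The exact linearization format used in the paper is not given in the abstract, so the token scheme here (opening bracket fused with its label, bare closing bracket), the function names, and the task tags `<translate>` and `<parse>` are all illustrative assumptions, not the authors' implementation.

```python
import re
from typing import List, Tuple


def linearize_parse(bracketed: str, keep_words: bool = True) -> List[str]:
    """Flatten a bracketed constituency parse into a token sequence.

    Assumed scheme (not taken from the paper): each opening bracket is
    fused with its label, e.g. "(NP", and each closing bracket is ")".
    """
    tokens = re.findall(r"\(|\)|[^\s()]+", bracketed)
    out: List[str] = []
    prev_open = False
    for tok in tokens:
        if tok == "(":
            prev_open = True
        elif tok == ")":
            out.append(")")
            prev_open = False
        elif prev_open:
            out.append("(" + tok)  # constituent or POS label
            prev_open = False
        elif keep_words:
            out.append(tok)  # terminal word
    return out


def multitask_examples(
    source: str, target: str, parse: str
) -> List[Tuple[List[str], List[str]]]:
    """Build one translation pair and one parsing pair from the same source.

    The task tags are hypothetical placeholders that tell a shared
    encoder-decoder which output is expected.
    """
    src = source.split()
    return [
        (["<translate>"] + src, target.split()),
        (["<parse>"] + src, linearize_parse(parse)),
    ]


if __name__ == "__main__":
    parse = "(NP (DT the) (NN dog))"
    print(" ".join(linearize_parse(parse)))
    for src, tgt in multitask_examples("the dog", "der Hund", parse):
        print(src, "->", tgt)
```

Because both tasks are expressed as plain token sequences, a standard sequence-to-sequence Transformer can be trained on them without architectural changes, which is the property the abstract emphasizes.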
Anthology ID:
W19-5203
Volume:
Proceedings of the Fourth Conference on Machine Translation (Volume 1: Research Papers)
Month:
August
Year:
2019
Address:
Florence, Italy
Editors:
Ondřej Bojar, Rajen Chatterjee, Christian Federmann, Mark Fishel, Yvette Graham, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Philipp Koehn, André Martins, Christof Monz, Matteo Negri, Aurélie Névéol, Mariana Neves, Matt Post, Marco Turchi, Karin Verspoor
Venue:
WMT
SIG:
SIGMT
Publisher:
Association for Computational Linguistics
Pages:
24–33
URL:
https://aclanthology.org/W19-5203/
DOI:
10.18653/v1/W19-5203
Cite (ACL):
Anna Currey and Kenneth Heafield. 2019. Incorporating Source Syntax into Transformer-Based Neural Machine Translation. In Proceedings of the Fourth Conference on Machine Translation (Volume 1: Research Papers), pages 24–33, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
Incorporating Source Syntax into Transformer-Based Neural Machine Translation (Currey & Heafield, WMT 2019)
PDF:
https://aclanthology.org/W19-5203.pdf