@inproceedings{deguchi-etal-2019-dependency,
title = "Dependency-Based Self-Attention for Transformer {NMT}",
author = "Deguchi, Hiroyuki and
Tamura, Akihiro and
Ninomiya, Takashi",
editor = "Mitkov, Ruslan and
Angelova, Galia",
booktitle = "Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019)",
month = sep,
year = "2019",
address = "Varna, Bulgaria",
publisher = "INCOMA Ltd.",
url = "https://aclanthology.org/R19-1028",
doi = "10.26615/978-954-452-056-4_028",
pages = "239--246",
abstract = "In this paper, we propose a new Transformer neural machine translation (NMT) model that incorporates dependency relations into self-attention on both source and target sides, dependency-based self-attention. The dependency-based self-attention is trained to attend to the modifiee for each token under constraints based on the dependency relations, inspired by Linguistically-Informed Self-Attention (LISA). While LISA is originally proposed for Transformer encoder for semantic role labeling, this paper extends LISA to Transformer NMT by masking future information on words in the decoder-side dependency-based self-attention. Additionally, our dependency-based self-attention operates at sub-word units created by byte pair encoding. The experiments show that our model improves 1.0 BLEU points over the baseline model on the WAT{'}18 Asian Scientific Paper Excerpt Corpus Japanese-to-English translation task.",
}
<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
  <mods ID="deguchi-etal-2019-dependency">
    <titleInfo>
      <title>Dependency-Based Self-Attention for Transformer NMT</title>
    </titleInfo>
    <name type="personal">
      <namePart type="given">Hiroyuki</namePart>
      <namePart type="family">Deguchi</namePart>
      <role>
        <roleTerm authority="marcrelator" type="text">author</roleTerm>
      </role>
    </name>
    <name type="personal">
      <namePart type="given">Akihiro</namePart>
      <namePart type="family">Tamura</namePart>
      <role>
        <roleTerm authority="marcrelator" type="text">author</roleTerm>
      </role>
    </name>
    <name type="personal">
      <namePart type="given">Takashi</namePart>
      <namePart type="family">Ninomiya</namePart>
      <role>
        <roleTerm authority="marcrelator" type="text">author</roleTerm>
      </role>
    </name>
    <originInfo>
      <dateIssued>2019-09</dateIssued>
    </originInfo>
    <typeOfResource>text</typeOfResource>
    <relatedItem type="host">
      <titleInfo>
        <title>Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019)</title>
      </titleInfo>
      <name type="personal">
        <namePart type="given">Ruslan</namePart>
        <namePart type="family">Mitkov</namePart>
        <role>
          <roleTerm authority="marcrelator" type="text">editor</roleTerm>
        </role>
      </name>
      <name type="personal">
        <namePart type="given">Galia</namePart>
        <namePart type="family">Angelova</namePart>
        <role>
          <roleTerm authority="marcrelator" type="text">editor</roleTerm>
        </role>
      </name>
      <originInfo>
        <publisher>INCOMA Ltd.</publisher>
        <place>
          <placeTerm type="text">Varna, Bulgaria</placeTerm>
        </place>
      </originInfo>
      <genre authority="marcgt">conference publication</genre>
    </relatedItem>
    <abstract>In this paper, we propose a new Transformer neural machine translation (NMT) model that incorporates dependency relations into self-attention on both the source and target sides, which we call dependency-based self-attention. The dependency-based self-attention is trained to attend to the modifiee of each token under constraints based on the dependency relations, inspired by Linguistically-Informed Self-Attention (LISA). While LISA was originally proposed for the Transformer encoder for semantic role labeling, this paper extends LISA to Transformer NMT by masking future information on words in the decoder-side dependency-based self-attention. Additionally, our dependency-based self-attention operates on sub-word units created by byte pair encoding. The experiments show that our model improves BLEU by 1.0 points over the baseline model on the WAT’18 Asian Scientific Paper Excerpt Corpus Japanese-to-English translation task.</abstract>
<identifier type="citekey">deguchi-etal-2019-dependency</identifier>
<identifier type="doi">10.26615/978-954-452-056-4_028</identifier>
<location>
<url>https://aclanthology.org/R19-1028</url>
</location>
<part>
<date>2019-09</date>
<extent unit="page">
<start>239</start>
<end>246</end>
</extent>
</part>
</mods>
</modsCollection>
%0 Conference Proceedings
%T Dependency-Based Self-Attention for Transformer NMT
%A Deguchi, Hiroyuki
%A Tamura, Akihiro
%A Ninomiya, Takashi
%Y Mitkov, Ruslan
%Y Angelova, Galia
%S Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019)
%D 2019
%8 September
%I INCOMA Ltd.
%C Varna, Bulgaria
%F deguchi-etal-2019-dependency
%X In this paper, we propose a new Transformer neural machine translation (NMT) model that incorporates dependency relations into self-attention on both the source and target sides, which we call dependency-based self-attention. The dependency-based self-attention is trained to attend to the modifiee of each token under constraints based on the dependency relations, inspired by Linguistically-Informed Self-Attention (LISA). While LISA was originally proposed for the Transformer encoder for semantic role labeling, this paper extends LISA to Transformer NMT by masking future information on words in the decoder-side dependency-based self-attention. Additionally, our dependency-based self-attention operates on sub-word units created by byte pair encoding. The experiments show that our model improves BLEU by 1.0 points over the baseline model on the WAT’18 Asian Scientific Paper Excerpt Corpus Japanese-to-English translation task.
%R 10.26615/978-954-452-056-4_028
%U https://aclanthology.org/R19-1028
%U https://doi.org/10.26615/978-954-452-056-4_028
%P 239-246
Markdown (Informal)
[Dependency-Based Self-Attention for Transformer NMT](https://aclanthology.org/R19-1028) (Deguchi et al., RANLP 2019)
ACL
Hiroyuki Deguchi, Akihiro Tamura, and Takashi Ninomiya. 2019. Dependency-Based Self-Attention for Transformer NMT. In Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019), pages 239–246, Varna, Bulgaria. INCOMA Ltd.
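
The abstract describes two ingredients on the decoder side: self-attention that is trained to attend to each token's modifiee (its dependency head), and a causal mask that hides future words. The snippet below is a minimal illustrative sketch of those two ingredients only, not the authors' implementation; the function names, the fall-back-to-self behaviour for heads that lie in the future, and the example head indices are all assumptions made for illustration.

```python
# Minimal sketch (not the authors' code) of the two ideas in the abstract:
# (1) a causal mask so decoder self-attention never sees future words, and
# (2) per-token supervision targets pointing at each token's dependency
#     head ("modifiee"). Falling back to self-attention when the head lies
#     in the future is an assumption made for this illustration.
import numpy as np


def causal_mask(n: int) -> np.ndarray:
    """Boolean mask: position i may attend only to positions j <= i."""
    return np.tril(np.ones((n, n), dtype=bool))


def dependency_targets(heads: list[int]) -> list[int]:
    """Index each decoder token should attend to under the dependency
    constraint, clipped so no target lies in the (masked) future."""
    return [h if h <= i else i for i, h in enumerate(heads)]


if __name__ == "__main__":
    # Hypothetical 4-token target sentence; heads[i] is token i's modifiee,
    # with the root pointing at itself.
    heads = [1, 1, 1, 2]
    print(causal_mask(len(heads)).astype(int))
    print(dependency_targets(heads))
```

In a full model, such targets would typically drive a supervision loss on one attention head, which is where the "trained to attend" phrasing in the abstract comes in; that training loop, and the handling of sub-word units from byte pair encoding, are beyond this sketch.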