Learning When to Concentrate or Divert Attention: Self-Adaptive Attention Temperature for Neural Machine Translation

Junyang Lin; Xu Sun; Xuancheng Ren; Muyu Li; Qi Su (苏琪, 苏祺, 祺苏)

doi:10.18653/v1/D18-1331

Abstract

Most Neural Machine Translation (NMT) models are based on the sequence-to-sequence (Seq2Seq) model with an encoder-decoder framework equipped with an attention mechanism. However, the conventional attention mechanism treats decoding at every time step in the same way, using the same matrix, which is problematic because the softness of attention should differ for different types of words (e.g., content words and function words). We therefore propose a new model with a mechanism called Self-Adaptive Control of Temperature (SACT), which controls the softness of attention by means of an attention temperature. Experimental results on Chinese-English and English-Vietnamese translation demonstrate that our model outperforms the baseline models, and the analysis and case study show that it can attend to the most relevant elements in the source-side context and generate high-quality translations.
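As a rough illustration of what an attention temperature does, the sketch below divides alignment scores by a temperature before the softmax. It is a generic temperature-scaled softmax, not the paper's SACT parameterization (the abstract does not give the formula for how the temperature is learned); the function name `attention_weights` and the example scores are hypothetical.

```python
import math


def attention_weights(scores, temperature=1.0):
    """Softmax over alignment scores divided by a positive temperature.

    Generic illustration only: in SACT the temperature is predicted at each
    decoding step, whereas here it is passed in by hand (an assumption).
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical alignment scores of one decoder step over three source words.
scores = [2.0, 1.0, 0.1]

sharp = attention_weights(scores, temperature=0.5)  # concentrated attention
soft = attention_weights(scores, temperature=2.0)   # diverted (softer) attention
```

With a low temperature most of the weight falls on the highest-scoring source word, while a high temperature spreads the weight more evenly over the source context. This is the sense in which a temperature controls the softness of attention.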

Anthology ID: D18-1331
Volume: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
Month: October-November
Year: 2018
Address: Brussels, Belgium
Editors: Ellen Riloff, David Chiang, Julia Hockenmaier, Jun’ichi Tsujii
Venue: EMNLP
SIG: SIGDAT
Publisher: Association for Computational Linguistics
Pages: 2985–2990
URL: https://aclanthology.org/D18-1331/
DOI: 10.18653/v1/D18-1331
Cite (ACL): Junyang Lin, Xu Sun, Xuancheng Ren, Muyu Li, and Qi Su. 2018. Learning When to Concentrate or Divert Attention: Self-Adaptive Attention Temperature for Neural Machine Translation. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 2985–2990, Brussels, Belgium. Association for Computational Linguistics.
Cite (Informal): Learning When to Concentrate or Divert Attention: Self-Adaptive Attention Temperature for Neural Machine Translation (Lin et al., EMNLP 2018)
PDF: https://aclanthology.org/D18-1331.pdf
