An Analysis of Encoder Representations in Transformer-Based Machine Translation

Alessandro Raganato, Jörg Tiedemann


Abstract
The attention mechanism is a successful technique in modern NLP, especially in tasks like machine translation. The recently proposed network architecture of the Transformer is based entirely on attention mechanisms and achieves new state of the art results in neural machine translation, outperforming other sequence-to-sequence models. However, so far not much is known about the internal properties of the model and the representations it learns to achieve that performance. To study this question, we investigate the information that is learned by the attention mechanism in Transformer models with different translation quality. We assess the representations of the encoder by extracting dependency relations based on self-attention weights, we perform four probing tasks to study the amount of syntactic and semantic captured information and we also test attention in a transfer learning scenario. Our analysis sheds light on the relative strengths and weaknesses of the various encoder representations. We observe that specific attention heads mark syntactic dependency relations and we can also confirm that lower layers tend to learn more about syntax while higher layers tend to encode more semantics.
Anthology ID:
W18-5431
Volume:
Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP
Month:
November
Year:
2018
Address:
Brussels, Belgium
Editors:
Tal Linzen, Grzegorz Chrupała, Afra Alishahi
Venue:
EMNLP
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
287–297
Language:
URL:
https://aclanthology.org/W18-5431/
DOI:
10.18653/v1/W18-5431
Bibkey:
Cite (ACL):
Alessandro Raganato and Jörg Tiedemann. 2018. An Analysis of Encoder Representations in Transformer-Based Machine Translation. In Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP, pages 287–297, Brussels, Belgium. Association for Computational Linguistics.
Cite (Informal):
An Analysis of Encoder Representations in Transformer-Based Machine Translation (Raganato & Tiedemann, EMNLP 2018)
Copy Citation:
PDF:
https://aclanthology.org/W18-5431.pdf