Analyzing the Structure of Attention in a Transformer Language Model

Jesse Vig, Yonatan Belinkov


Abstract
The Transformer is a fully attention-based alternative to recurrent networks that has achieved state-of-the-art results across a range of NLP tasks. In this paper, we analyze the structure of attention in a Transformer language model, the GPT-2 small pretrained model. We visualize attention for individual instances and analyze the interaction between attention and syntax over a large corpus. We find that attention targets different parts of speech at different layer depths within the model, and that attention aligns with dependency relations most strongly in the middle layers. We also find that the deepest layers of the model capture the most distant relationships. Finally, we extract exemplar sentences that reveal highly specific patterns targeted by particular attention heads.
Anthology ID:
W19-4808
Volume:
Proceedings of the 2019 ACL Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP
Month:
August
Year:
2019
Address:
Florence, Italy
Editors:
Tal Linzen, Grzegorz Chrupała, Yonatan Belinkov, Dieuwke Hupkes
Venue:
BlackboxNLP
Publisher:
Association for Computational Linguistics
Pages:
63–76
URL:
https://aclanthology.org/W19-4808
DOI:
10.18653/v1/W19-4808
Cite (ACL):
Jesse Vig and Yonatan Belinkov. 2019. Analyzing the Structure of Attention in a Transformer Language Model. In Proceedings of the 2019 ACL Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP, pages 63–76, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
Analyzing the Structure of Attention in a Transformer Language Model (Vig & Belinkov, BlackboxNLP 2019)
PDF:
https://aclanthology.org/W19-4808.pdf
Poster:
W19-4808.Poster.pdf