Elena Voita, David Talbot, Fedor Moiseev, Rico Sennrich, and Ivan Titov. 2019. Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (Anna Korhonen, David Traum, and Lluís Màrquez, editors), pages 5797–5808, Florence, Italy, July 2019. Association for Computational Linguistics. Anthology ID: voita-etal-2019-analyzing. DOI: 10.18653/v1/P19-1580. https://aclanthology.org/P19-1580/