Title: Roles and Utilization of Attention Heads in Transformer-based Neural Language Models
Authors: Jae-young Jo; Sung-Hyon Myaeng
Date: 2020-07
Resource type: text
Published in: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
Editors: Dan Jurafsky; Joyce Chai; Natalie Schluter; Joel Tetreault
Publisher: Association for Computational Linguistics
Place: Online
Genre: conference publication
Identifier: jo-myaeng-2020-roles
DOI: 10.18653/v1/2020.acl-main.311
URL: https://aclanthology.org/2020.acl-main.311/
Pages: 3404-3417