Jian Li, Baosong Yang, Zi-Yi Dou, Xing Wang, Michael R Lyu, and Zhaopeng Tu. 2019. Information Aggregation for Multi-Head Attention with Routing-by-Agreement. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), edited by Jill Burstein, Christy Doran, and Thamar Solorio, pages 3566-3575, Minneapolis, Minnesota, June 2019. Association for Computational Linguistics. Type: conference publication. Identifier: li-etal-2019-information. DOI: 10.18653/v1/N19-1359. URL: https://aclanthology.org/N19-1359/