Wenhui Wang, Hangbo Bao, Shaohan Huang, Li Dong, and Furu Wei. 2021. MiniLMv2: Multi-Head Self-Attention Relation Distillation for Compressing Pretrained Transformers. In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021 (Chengqing Zong, Fei Xia, Wenjie Li, and Roberto Navigli, editors), pages 2140–2151, Online. Association for Computational Linguistics. DOI: 10.18653/v1/2021.findings-acl.188. https://aclanthology.org/2021.findings-acl.188/