Ziqing Yang, Yiming Cui, Xin Yao, and Shijin Wang. 2023. Gradient-based Intra-attention Pruning on Pre-trained Language Models. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), edited by Anna Rogers, Jordan Boyd-Graber, and Naoaki Okazaki, pages 2775-2790, Toronto, Canada, July 2023. Association for Computational Linguistics. Conference publication. Anthology ID: yang-etal-2023-gradient. DOI: 10.18653/v1/2023.acl-long.156. URL: https://aclanthology.org/2023.acl-long.156/