Title: Pruning Large Language Models to Intra-module Low-rank Architecture with Transitional Activations
Authors: Bowen Shen, Zheng Lin, Daren Zha, Wei Liu, Jian Luan, Bin Wang, Weiping Wang
Date: 2024-08
Type: text (conference publication)
Published in: Findings of the Association for Computational Linguistics: ACL 2024
Editors: Lun-Wei Ku, Andre Martins, Vivek Srikumar
Publisher: Association for Computational Linguistics
Location: Bangkok, Thailand
Citation key: shen-etal-2024-pruning
DOI: 10.18653/v1/2024.findings-acl.582
URL: https://aclanthology.org/2024.findings-acl.582/
Pages: 9781-9793