Title: Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models
Authors: Xudong Lu, Qi Liu, Yuhui Xu, Aojun Zhou, Siyuan Huang, Bo Zhang, Junchi Yan, Hongsheng Li
Published in: Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Editors: Lun-Wei Ku, Andre Martins, Vivek Srikumar
Publisher: Association for Computational Linguistics
Location: Bangkok, Thailand
Date: 2024-08
Type: conference publication
Pages: 6159-6172
Identifier: lu-etal-2024-experts
DOI: 10.18653/v1/2024.acl-long.334
URL: https://aclanthology.org/2024.acl-long.334/