Title: MoEfication: Transformer Feed-forward Layers are Mixtures of Experts
Authors: Zhengyan Zhang, Yankai Lin, Zhiyuan Liu, Peng Li, Maosong Sun, Jie Zhou
Date: 2022-05
Published in: Findings of the Association for Computational Linguistics: ACL 2022
Editors: Smaranda Muresan, Preslav Nakov, Aline Villavicencio
Publisher: Association for Computational Linguistics
Location: Dublin, Ireland
Genre: conference publication
Pages: 877-890
Citation key: zhang-etal-2022-moefication
DOI: 10.18653/v1/2022.findings-acl.71
URL: https://aclanthology.org/2022.findings-acl.71/