%0 Conference Proceedings
%T DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
%A Dai, Damai
%A Deng, Chengqi
%A Zhao, Chenggang
%A Xu, R.X.
%A Gao, Huazuo
%A Chen, Deli
%A Li, Jiashi
%A Zeng, Wangding
%A Yu, Xingkai
%A Wu, Y.
%A Xie, Zhenda
%A Li, Y.K.
%A Huang, Panpan
%A Luo, Fuli
%A Ruan, Chong
%A Sui, Zhifang
%A Liang, Wenfeng
%Y Ku, Lun-Wei
%Y Martins, Andre
%Y Srikumar, Vivek
%S Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
%D 2024
%8 August
%I Association for Computational Linguistics
%C Bangkok, Thailand
%F dai-etal-2024-deepseekmoe
%R 10.18653/v1/2024.acl-long.70
%U https://aclanthology.org/2024.acl-long.70/
%U https://doi.org/10.18653/v1/2024.acl-long.70
%P 1280-1297