%0 Conference Proceedings
%T SwapMoE: Serving Off-the-shelf MoE-based Large Language Models with Tunable Memory Budget
%A Kong, Rui
%A Li, Yuanchun
%A Feng, Qingtian
%A Wang, Weijun
%A Ye, Xiaozhou
%A Ouyang, Ye
%A Kong, Linghe
%A Liu, Yunxin
%Y Ku, Lun-Wei
%Y Martins, Andre
%Y Srikumar, Vivek
%S Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
%D 2024
%8 August
%I Association for Computational Linguistics
%C Bangkok, Thailand
%F kong-etal-2024-swapmoe
%R 10.18653/v1/2024.acl-long.363
%U https://aclanthology.org/2024.acl-long.363/
%U https://doi.org/10.18653/v1/2024.acl-long.363
%P 6710-6720