Title: Mixture-of-Supernets: Improving Weight-Sharing Supernet Training with Architecture-Routed Mixture-of-Experts
Authors: Ganesh Jawahar, Haichuan Yang, Yunyang Xiong, Zechun Liu, Dilin Wang, Fei Sun, Meng Li, Aasish Pappu, Barlas Oguz, Muhammad Abdul-Mageed, Laks Lakshmanan, Raghuraman Krishnamoorthi, Vikas Chandra
Published in: Findings of the Association for Computational Linguistics: ACL 2024
Editors: Lun-Wei Ku, Andre Martins, Vivek Srikumar
Publisher: Association for Computational Linguistics
Location: Bangkok, Thailand
Date: 2024-08
Type: conference publication
Pages: 10424-10443
Citation key: jawahar-etal-2024-mixture
DOI: 10.18653/v1/2024.findings-acl.621
URL: https://aclanthology.org/2024.findings-acl.621/