Title: Towards Being Parameter-Efficient: A Stratified Sparsely Activated Transformer with Dynamic Capacity
Authors: Haoran Xu, Maha Elbayad, Kenton Murray, Jean Maillard, Vedanuj Goswami
Published in: Findings of the Association for Computational Linguistics: EMNLP 2023 (conference publication)
Editors: Houda Bouamor, Juan Pino, Kalika Bali
Publisher: Association for Computational Linguistics
Location: Singapore
Date: 2023-12
Pages: 12858-12870
Citation key: xu-etal-2023-towards-parameter
DOI: 10.18653/v1/2023.findings-emnlp.856
URL: https://aclanthology.org/2023.findings-emnlp.856/