Title: LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding
Authors: Mostafa Elhoushi, Akshat Shrivastava, Diana Liskovich, Basil Hosmer, Bram Wasti, Liangzhen Lai, Anas Mahmoud, Bilge Acun, Saurabh Agarwal, Ahmed Roman, Ahmed Aly, Beidi Chen, Carole-Jean Wu
Published in: Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Editors: Lun-Wei Ku, Andre Martins, Vivek Srikumar
Publisher: Association for Computational Linguistics
Location: Bangkok, Thailand
Date: 2024-08
Pages: 12622-12642
Type: conference publication
Citation key: elhoushi-etal-2024-layerskip
DOI: 10.18653/v1/2024.acl-long.681
URL: https://aclanthology.org/2024.acl-long.681/