Shicheng Tan, Weng Lam Tam, Yuanchun Wang, Wenwen Gong, Shu Zhao, Peng Zhang, and Jie Tang. 2023. Are Intermediate Layers and Labels Really Necessary? A General Language Model Distillation Method. In Findings of the Association for Computational Linguistics: ACL 2023, edited by Anna Rogers, Jordan Boyd-Graber, and Naoaki Okazaki, pages 9678-9696, Toronto, Canada, July 2023. Association for Computational Linguistics. Conference publication. DOI: 10.18653/v1/2023.findings-acl.614. URL: https://aclanthology.org/2023.findings-acl.614/. Citation key: tan-etal-2023-intermediate.