Xinpeng Wang, Leonie Weissweiler, Hinrich Schütze, and Barbara Plank. 2023. How to Distill your BERT: An Empirical Study on the Impact of Weight Initialisation and Distillation Objectives. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 1843-1852, Toronto, Canada, July 2023. Edited by Anna Rogers, Jordan Boyd-Graber, and Naoaki Okazaki. Association for Computational Linguistics. Conference publication. Citation key: wang-etal-2023-distill. DOI: 10.18653/v1/2023.acl-short.157. URL: https://aclanthology.org/2023.acl-short.157/