Hsieh, Cheng-Yu; Li, Chun-Liang; Yeh, Chih-kuan; Nakhost, Hootan; Fujii, Yasuhisa; Ratner, Alex; Krishna, Ranjay; Lee, Chen-Yu; Pfister, Tomas. "Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes." In Findings of the Association for Computational Linguistics: ACL 2023, edited by Anna Rogers, Jordan Boyd-Graber, and Naoaki Okazaki, pages 8003-8017. Toronto, Canada: Association for Computational Linguistics, July 2023.
Anthology ID: hsieh-etal-2023-distilling
DOI: 10.18653/v1/2023.findings-acl.507
URL: https://aclanthology.org/2023.findings-acl.507/