Small Models Struggle to Learn from Strong Reasoners

Yuetai Li; Xiang Yue; Zhangchen Xu; Fengqing Jiang; Luyao Niu; Bill Yuchen Lin; Bhaskar Ramasubramanian; Radha Poovendran

doi:10.18653/v1/2025.findings-acl.1301

Abstract

Large language models (LLMs) excel in complex reasoning tasks, and distilling their reasoning capabilities into smaller models has shown promise. However, we uncover an interesting phenomenon, which we term the Small Model Learnability Gap: small models (≤3B parameters) do not consistently benefit from long chain-of-thought (CoT) reasoning or distillation from larger models. Instead, they perform better when fine-tuned on shorter, simpler reasoning chains that better align with their intrinsic learning capacity. To address this, we propose Mix Distillation, a simple yet effective strategy that balances reasoning complexity by combining long and short CoT examples or reasoning from both larger and smaller models. Our experiments demonstrate that Mix Distillation significantly improves small model reasoning performance compared to training on either data alone. These findings highlight the limitations of direct strong model distillation and underscore the importance of adapting reasoning complexity for effective reasoning capability transfer.
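
The data-mixing idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `mix_distillation`, the `long_fraction` parameter, its default value, and the example record layout are all assumptions made for illustration. The sketch only shows how a fine-tuning set could be assembled from two pools of examples (long-CoT and short-CoT, or large-teacher and small-teacher) at a chosen ratio.

```python
import random


def mix_distillation(long_cot, short_cot, long_fraction=0.2, size=None, seed=0):
    """Assemble a fine-tuning set mixing long-CoT and short-CoT examples.

    long_fraction is a hypothetical knob (share of long-CoT examples in the
    mix); the value used in the paper is not stated in the abstract.
    """
    if not 0.0 <= long_fraction <= 1.0:
        raise ValueError("long_fraction must be in [0, 1]")

    rng = random.Random(seed)  # local RNG so the mix is reproducible
    if size is None:
        size = min(len(long_cot), len(short_cot))

    n_long = round(size * long_fraction)
    n_short = size - n_long
    if n_long > len(long_cot) or n_short > len(short_cot):
        raise ValueError("not enough examples in one of the pools")

    # Sample without replacement from each pool, then shuffle so the two
    # kinds of examples are interleaved during training.
    mixed = rng.sample(long_cot, n_long) + rng.sample(short_cot, n_short)
    rng.shuffle(mixed)
    return mixed
```

For example, with two pools of 100 examples each, `mix_distillation(long_pool, short_pool, long_fraction=0.2, size=50)` would return 50 shuffled examples, 10 of them drawn from the long-CoT pool. The same function applies unchanged if the two pools hold responses from a larger and a smaller teacher model instead.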

Anthology ID: 2025.findings-acl.1301
Volume: Findings of the Association for Computational Linguistics: ACL 2025
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 25366–25394
URL: https://aclanthology.org/2025.findings-acl.1301/
DOI: 10.18653/v1/2025.findings-acl.1301
Cite (ACL): Yuetai Li, Xiang Yue, Zhangchen Xu, Fengqing Jiang, Luyao Niu, Bill Yuchen Lin, Bhaskar Ramasubramanian, and Radha Poovendran. 2025. Small Models Struggle to Learn from Strong Reasoners. In Findings of the Association for Computational Linguistics: ACL 2025, pages 25366–25394, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): Small Models Struggle to Learn from Strong Reasoners (Li et al., Findings 2025)
PDF: https://aclanthology.org/2025.findings-acl.1301.pdf
