Teaching Small Language Models Reasoning through Counterfactual Distillation

Tao Feng, Yicheng Li, Li Chenglin, Hao Chen, Fei Yu, Yin Zhang


Abstract
With the rise of large language models (LLMs), much recent work seeks to transfer the reasoning capabilities of LLMs to small language models (SLMs). Previous distillation methods typically use LLMs to generate chain-of-thought (CoT) samples and teach SLMs via fine-tuning. However, such standard distillation performs poorly on out-of-distribution (OOD) examples, and the diversity of the generated CoT samples is insufficient. In this work, we propose a novel counterfactual distillation framework. First, we leverage LLMs to automatically generate high-quality counterfactual data: given an input text example, our method produces a counterfactual example that closely resembles the original input but whose task label is changed to a desired target. We then use multi-view CoT to enhance the diversity of reasoning samples. Experiments on four NLP benchmarks show that our approach enhances the reasoning capabilities of SLMs and is more robust to OOD data. We also conduct extensive ablations and sample studies to understand the reasoning capabilities of SLMs.
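The abstract only outlines the pipeline at a high level. The Python sketch below illustrates how such a counterfactual-distillation data pipeline might be assembled; the prompt wordings, the function names (make_counterfactual_prompt, make_cot_prompt, build_distillation_set), the target_label_of mapping, the views list, and the generate callable are all hypothetical placeholders for illustration, not the authors' released code.

# Hypothetical sketch of the counterfactual-distillation data pipeline described in
# the abstract: (1) ask a teacher LLM for a minimally edited counterfactual whose
# label flips, (2) collect multi-view CoT rationales for both the original and the
# counterfactual, (3) assemble (input, rationale, label) triples for SLM fine-tuning.

from typing import Callable, List, Dict

def make_counterfactual_prompt(text: str, original_label: str, target_label: str) -> str:
    """Ask the teacher LLM for a minimally edited copy of `text` whose label flips."""
    return (
        f"Original example (label: {original_label}):\n{text}\n\n"
        f"Rewrite the example with as few edits as possible so that its label "
        f"becomes '{target_label}'. Return only the rewritten text."
    )

def make_cot_prompt(text: str, label: str, view: str) -> str:
    """Ask the teacher LLM for a chain-of-thought rationale from one reasoning 'view'."""
    return (
        f"Example:\n{text}\n\nThe correct label is '{label}'. "
        f"Explain step by step, {view}, why this label is correct."
    )

def build_distillation_set(
    examples: List[Dict[str, str]],          # each item: {"text": ..., "label": ...}
    target_label_of: Callable[[str], str],   # maps a label to the desired flipped label
    generate: Callable[[str], str],          # wraps a teacher-LLM API call
    views: List[str],                        # reasoning "views" for multi-view CoT
) -> List[Dict[str, str]]:
    """Produce (input, rationale, label) triples for fine-tuning the small model."""
    distill = []
    for ex in examples:
        # Counterfactual copy of the input with its label changed to the target.
        cf_text = generate(make_counterfactual_prompt(
            ex["text"], ex["label"], target_label_of(ex["label"])))
        pairs = [(ex["text"], ex["label"]),
                 (cf_text, target_label_of(ex["label"]))]
        for text, label in pairs:
            for view in views:               # multiple views increase rationale diversity
                rationale = generate(make_cot_prompt(text, label, view))
                distill.append({"input": text, "rationale": rationale, "label": label})
    return distill

In practice, generate would wrap whatever teacher-LLM API is available, and views could be phrasings such as "focusing on the key phrases" or "by contrasting the two versions of the example"; the paper itself should be consulted for the actual prompts and views used.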
Anthology ID:
2024.emnlp-main.333
Volume:
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2024
Address:
Miami, Florida, USA
Editors:
Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
5831–5842
URL:
https://aclanthology.org/2024.emnlp-main.333
Cite (ACL):
Tao Feng, Yicheng Li, Li Chenglin, Hao Chen, Fei Yu, and Yin Zhang. 2024. Teaching Small Language Models Reasoning through Counterfactual Distillation. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 5831–5842, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal):
Teaching Small Language Models Reasoning through Counterfactual Distillation (Feng et al., EMNLP 2024)
PDF:
https://aclanthology.org/2024.emnlp-main.333.pdf