Can LLMs Learn From Mistakes? An Empirical Study on Reasoning Tasks

Shengnan An, Zexiong Ma, Siqi Cai, Zeqi Lin, Nanning Zheng, Jian-Guang Lou, Weizhu Chen


Abstract
To enhance the chain-of-thought (CoT) reasoning of large language models (LLMs), much existing work has demonstrated the effectiveness of straightforward learning on annotated/generated CoT paths. However, there is still little evidence that reasoning capabilities can be enhanced through a reverse learning process, i.e., learning from potential mistakes in reasoning. To investigate whether LLMs can learn from mistakes, we construct mistake-correction datasets, using GPT-4 to identify and correct the mistakes in inaccurate CoTs. With these mistake-correction datasets, we fine-tune open-source LLMs and arrive at the following conclusions. (1) LLMs can indeed learn from mistakes to improve their CoT reasoning performance. (2) Compared to CoT data alone, mistake-correction data provides additional knowledge about the explanations and causes of potential mistakes in CoTs, which consistently contributes to the effectiveness of learning from mistakes. (3) Evolution techniques, especially the correction-centric evolution we introduce, can further enhance the effectiveness of learning from mistakes.
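
The data-construction step described above can be illustrated with a short sketch. The snippet below is a minimal sketch assuming an OpenAI-style chat API: it prompts GPT-4 to identify, explain, and correct the mistake in an inaccurate CoT, then packages the result as a fine-tuning example. The prompt wording, the output field names, and the build_correction_example helper are illustrative assumptions, not the authors' exact pipeline.

# Hypothetical sketch of mistake-correction data construction:
# GPT-4 identifies and corrects the error in an inaccurate CoT.
# Prompt text, field names, and this helper are assumptions for
# illustration, not the paper's exact setup.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

CORRECTION_PROMPT = (
    "The following solution to a problem contains a mistake.\n"
    "1. Point out the first incorrect step.\n"
    "2. Explain why it is wrong.\n"
    "3. Rewrite the solution correctly.\n\n"
    "Question: {question}\n"
    "Inaccurate solution: {wrong_cot}"
)

def build_correction_example(question: str, wrong_cot: str) -> dict:
    """Return one mistake-correction training example."""
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{
            "role": "user",
            "content": CORRECTION_PROMPT.format(
                question=question, wrong_cot=wrong_cot),
        }],
        temperature=0,
    )
    correction = response.choices[0].message.content
    # Fine-tuning pair: the open-source LLM sees the question plus the
    # wrong CoT and learns to produce the identification, explanation,
    # and corrected solution.
    return {
        "input": f"{question}\nInaccurate solution: {wrong_cot}",
        "target": correction,
    }

Under this sketch, the "target" field carries not only the corrected CoT but also the explanation of why the original step was wrong, which is the extra supervision signal the abstract credits for the consistent gains over plain CoT data.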
Anthology ID:
2024.findings-emnlp.46
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2024
Month:
November
Year:
2024
Address:
Miami, Florida, USA
Editors:
Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
833–854
URL:
https://aclanthology.org/2024.findings-emnlp.46
Cite (ACL):
Shengnan An, Zexiong Ma, Siqi Cai, Zeqi Lin, Nanning Zheng, Jian-Guang Lou, and Weizhu Chen. 2024. Can LLMs Learn From Mistakes? An Empirical Study on Reasoning Tasks. In Findings of the Association for Computational Linguistics: EMNLP 2024, pages 833–854, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal):
Can LLMs Learn From Mistakes? An Empirical Study on Reasoning Tasks (An et al., Findings 2024)
PDF:
https://aclanthology.org/2024.findings-emnlp.46.pdf