Unleashing the Reasoning Potential of LLMs by Critique Fine-Tuning on One Problem

Yubo Wang; Ping Nie; Kai Zou; Lijun Wu; Wenhu Chen

doi:10.18653/v1/2025.emnlp-main.149

Abstract

Critique Fine-Tuning (CFT) has recently emerged as a promising paradigm for unlocking the reasoning capabilities of large language models (LLMs). In this work, we introduce one-shot CFT, a highly compute-efficient approach that leverages critique data generated from a single math problem. Remarkably, this method yields significant gains in reasoning accuracy, surpassing one-shot RLVR (Reinforcement Learning with Verifiable Reward) while requiring 15 to 20 times less compute. Given one math problem, we first prompt a set of diverse small models to produce candidate solutions, then use frontier models such as GPT-4.1 to generate high-quality critiques of these responses. We fine-tune Qwen and Llama family models ranging from 1.5B to 14B parameters with CFT. With just 5 GPU hours, our models achieve up to a 16 percent absolute improvement in average accuracy across six mathematical reasoning benchmarks (for example, Qwen2.5-Math-7B improves from 26 percent to 42 percent). Furthermore, ablation studies reveal the robustness of one-shot CFT across different prompt problems. Our findings suggest an extremely compute-efficient approach to unleash the reasoning potential of LLMs.
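The data-construction step described in the abstract (one problem, several candidate solutions, a critique for each) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `CritiqueExample` and `build_cft_dataset`, the prompt template, and the choice to pair every critic with every solution are all hypothetical, and the solver and critic callables stand in for the small models and frontier models the abstract mentions.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class CritiqueExample:
    prompt: str  # the problem plus one candidate solution
    target: str  # the critique the model is fine-tuned to produce


def build_cft_dataset(
    problem: str,
    solvers: List[Callable[[str], str]],
    critics: List[Callable[[str, str], str]],
) -> List[CritiqueExample]:
    """Build critique training examples from a single problem.

    Each solver (hypothetical stand-in for a small model) returns a candidate
    solution; each critic (hypothetical stand-in for a frontier model) returns
    a critique of that solution.
    """
    examples: List[CritiqueExample] = []
    for solve in solvers:
        solution = solve(problem)
        for critique in critics:
            # Assumed prompt layout; the abstract does not specify the format.
            prompt = (
                f"Problem:\n{problem}\n\n"
                f"Candidate solution:\n{solution}\n\n"
                "Critique the solution."
            )
            examples.append(CritiqueExample(prompt, critique(problem, solution)))
    return examples
```

With, say, three solvers and two critics this yields six training examples from the one problem; the resulting prompt and target pairs would then be passed to an ordinary supervised fine-tuning loop, which is outside the scope of this sketch.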

Anthology ID: 2025.emnlp-main.149
Volume: Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Month: November
Year: 2025
Address: Suzhou, China
Editors: Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 3017–3027
URL: https://aclanthology.org/2025.emnlp-main.149/
DOI: 10.18653/v1/2025.emnlp-main.149
Cite (ACL): Yubo Wang, Ping Nie, Kai Zou, Lijun Wu, and Wenhu Chen. 2025. Unleashing the Reasoning Potential of LLMs by Critique Fine-Tuning on One Problem. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 3017–3027, Suzhou, China. Association for Computational Linguistics.
Cite (Informal): Unleashing the Reasoning Potential of LLMs by Critique Fine-Tuning on One Problem (Wang et al., EMNLP 2025)
PDF: https://aclanthology.org/2025.emnlp-main.149.pdf
Checklist: 2025.emnlp-main.149.checklist.pdf
