CoT-Valve: Length-Compressible Chain-of-Thought Tuning

Xinyin Ma; Guangnian Wan; Runpeng Yu; Gongfan Fang; Xinchao Wang

doi:10.18653/v1/2025.acl-long.300

Abstract

Chain-of-Thought significantly enhances a model’s reasoning capability, but long chains also bring a considerable increase in inference cost. Observing that the reasoning path can be compressed easily on easy tasks but is harder to compress on hard tasks, we explore the feasibility of elastically controlling the length of reasoning paths with only one model, thereby reducing the inference overhead of reasoning models dynamically based on task difficulty. We introduce a new tuning and inference strategy named CoT-Valve, designed to allow models to generate reasoning chains of varying lengths. To achieve this, we propose to identify a direction in the parameter space that, when manipulated, can effectively control the length of the generated CoT. Moreover, we show that this property is valuable for compressing the reasoning chain. We construct datasets with chains from long to short for the same questions and explore two enhanced strategies for CoT-Valve: (1) a precise length-compressible CoT tuning method, and (2) a progressive chain length compression approach. Our experiments show that CoT-Valve successfully enables controllability and compressibility of the chain and outperforms prompt-based control. Applied to QwQ-32B-Preview, the method reduces reasoning chains on GSM8K from 741 to 225 tokens with a minor performance drop (95.07% to 94.92%), and on AIME from 6827 to 4629 tokens with only one additional incorrect answer.
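As a rough illustration of what "a direction in the parameter space" could mean in practice, the sketch below shifts a set of weights along a fixed update by a scalar strength, then checks the token reductions quoted in the abstract. Everything here is an assumption made for illustration: the names `apply_direction`, `alpha`, `base`, and `delta` are hypothetical, the toy weights are invented, and the abstract does not state how the direction is parameterized or which sign or magnitude corresponds to shorter chains.

```python
from typing import Dict, List

# Hypothetical representation: parameter name -> flat list of weights.
Params = Dict[str, List[float]]


def apply_direction(base: Params, delta: Params, alpha: float) -> Params:
    """Return base + alpha * delta for every parameter.

    Assumption: `delta` is a previously learned update direction and
    `alpha` is a scalar that moves the model along it.
    """
    return {
        name: [w + alpha * d for w, d in zip(weights, delta[name])]
        for name, weights in base.items()
    }


def relative_reduction(before: float, after: float) -> float:
    """Fraction of tokens removed when going from `before` to `after`."""
    return (before - after) / before


# Toy weights, invented for the example only.
base = {"layer.weight": [0.5, -1.0, 2.0]}
delta = {"layer.weight": [0.25, 0.5, -1.0]}

unchanged = apply_direction(base, delta, alpha=0.0)  # same values as base
halfway = apply_direction(base, delta, alpha=0.5)    # partial shift
full = apply_direction(base, delta, alpha=1.0)       # full shift

# Token counts taken from the abstract.
gsm8k_reduction = relative_reduction(741, 225)   # about 0.696
aime_reduction = relative_reduction(6827, 4629)  # about 0.322
```

Under these assumptions, a single set of base weights plus one stored direction is enough to produce a family of models indexed by `alpha`, which is one way to read the claim of controlling chain length "with only one model". The reductions work out to roughly 70% fewer tokens on GSM8K and roughly 32% fewer on AIME.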

Anthology ID: 2025.acl-long.300
Volume: Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 6025–6035
URL: https://aclanthology.org/2025.acl-long.300/
DOI: 10.18653/v1/2025.acl-long.300
Cite (ACL): Xinyin Ma, Guangnian Wan, Runpeng Yu, Gongfan Fang, and Xinchao Wang. 2025. CoT-Valve: Length-Compressible Chain-of-Thought Tuning. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 6025–6035, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): CoT-Valve: Length-Compressible Chain-of-Thought Tuning (Ma et al., ACL 2025)
PDF: https://aclanthology.org/2025.acl-long.300.pdf
