CARE-STaR: Constraint-aware Self-taught Reasoner

Zhiliang Li; Bo Tang; Yijun Niu; Beihong Jin; Qiwen Shi; Yuchen Feng; Zhiyu Li; Jie Hu; Mingchuan Yang; Feiyu Xiong

doi:10.18653/v1/2025.findings-acl.1116

Abstract

In real-world applications, large language models (LLMs) often need to handle diverse and complex instructions. In particular, when instructions carry multiple constraints, some of them ambiguous, LLMs often fail to produce answers that satisfy all constraints, which limits their effectiveness in various tasks. To address this challenge, we examine the different constraints in instructions and find that the difficulty of determining whether an answer meets a constraint varies widely, from extremely straightforward to exceptionally perplexing. Accordingly, we propose to assign constraints to different constraint levels. Furthermore, inspired by chain-of-thought (CoT) and the self-taught reasoner (STaR), we propose a two-stage method named CARE-STaR (Constraint-AwaRE STaR). Our method distinguishes constraints within instructions by generating different CoTs, and guides LLMs to autonomously learn optimal answers by assigning positive rewards to the CoTs that help generate accurate answers and by iteratively optimizing these answers. We conducted extensive experiments on three instruction-following benchmarks, using three existing LLMs as base models. Experimental results indicate that our method substantially enhances the capability of these LLMs to handle complex instructions, outperforming supervised fine-tuning (SFT). Our code is available at https://github.com/lzl0124/carestar.

Anthology ID: 2025.findings-acl.1116
Volume: Findings of the Association for Computational Linguistics: ACL 2025
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 21689–21703
URL: https://aclanthology.org/2025.findings-acl.1116/
DOI: 10.18653/v1/2025.findings-acl.1116
Cite (ACL): Zhiliang Li, Bo Tang, Yijun Niu, Beihong Jin, Qiwen Shi, Yuchen Feng, Zhiyu Li, Jie Hu, Mingchuan Yang, and Feiyu Xiong. 2025. CARE-STaR: Constraint-aware Self-taught Reasoner. In Findings of the Association for Computational Linguistics: ACL 2025, pages 21689–21703, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): CARE-STaR: Constraint-aware Self-taught Reasoner (Li et al., Findings 2025)
PDF: https://aclanthology.org/2025.findings-acl.1116.pdf
