IF-CRITIC: Towards a Fine-Grained LLM Critic for Instruction-Following Evaluation

Bosi Wen; Yilin Niu; Cunxiang Wang; Pei Ke; Xiaoying Ling; Ying Zhang; Aohan Zeng; Hongning Wang; Minlie Huang

Abstract

Instruction-following is a fundamental ability of Large Language Models (LLMs), requiring their generated outputs to follow multiple constraints imposed in input instructions. Numerous studies have attempted to enhance this ability through preference optimization or reinforcement learning based on reward signals from LLM-as-a-Judge. However, existing evaluation models for instruction-following still have notable deficiencies, such as substantial costs and unreliable assessments. To this end, we propose IF-CRITIC, an LLM critic for fine-grained, efficient, and reliable instruction-following evaluation. We first develop a checklist generator to decompose instructions and generate constraint checklists. With the assistance of the checklists, we collect high-quality critique training data through a multi-stage critique filtering mechanism and employ a constraint-level preference optimization method to train IF-CRITIC. Extensive experiments show that the evaluation performance of IF-CRITIC can beat strong LLM-as-a-Judge baselines, including o4-mini and Gemini-3-Pro. With the reward signals provided by IF-CRITIC, LLMs can achieve substantial performance gains in instruction-following optimization under lower computational overhead compared to strong LLM critic baselines. Our code and model are available at https://github.com/thu-coai/IF-CRITIC.

Anthology ID: 2026.acl-long.1147
Volume: Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 25004–25029
URL: https://aclanthology.org/2026.acl-long.1147/
Cite (ACL): Bosi Wen, Yilin Niu, Cunxiang Wang, Pei Ke, Xiaoying Ling, Ying Zhang, Aohan Zeng, Hongning Wang, and Minlie Huang. 2026. IF-CRITIC: Towards a Fine-Grained LLM Critic for Instruction-Following Evaluation. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 25004–25029, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): IF-CRITIC: Towards a Fine-Grained LLM Critic for Instruction-Following Evaluation (Wen et al., ACL 2026)
PDF: https://aclanthology.org/2026.acl-long.1147.pdf
Checklist: 2026.acl-long.1147.checklist.pdf
