Code Reffix: A Benchmark for Reflection-Guided Code Repair with Large Language Models

Zaiyuan Di; Jianting Chen; Yunxiao Yang; Xiaoying Gao (高晓影); Li Yang; Zhihao Wang; Yang Xiang

Abstract

While recent studies have increasingly emphasized the role of reflection in code repair tasks, existing benchmarks still target the repair generation capability of LLMs, lacking fine-grained evaluation of reflection generation capability. To this end, we propose Code Reffix, a benchmark featuring an automated pipeline with oracle reflections and a dual-task protocol to decouple the evaluation of reflection from repair. Through extensive experiments on 14 LLMs and fine-tuning analysis, we aim to pinpoint performance bottlenecks of code repair, quantify reflection quality, and verify the value of reflection optimization. Evaluations reveal that underperforming reflection capabilities of small-scale LLMs remain a major bottleneck for code repair. By quantifying this gap, Code Reffix provides a critical foundation for optimizing LLMs to achieve superior repair performance.

Anthology ID: 2026.findings-acl.762
Volume: Findings of the Association for Computational Linguistics: ACL 2026
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 15541–15561
URL: https://aclanthology.org/2026.findings-acl.762/
Cite (ACL): Zaiyuan Di, Jianting Chen, Yunxiao Yang, Xiaoying Gao, Li Yang, Zhihao Wang, and Yang Xiang. 2026. Code Reffix: A Benchmark for Reflection-Guided Code Repair with Large Language Models. In Findings of the Association for Computational Linguistics: ACL 2026, pages 15541–15561, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): Code Reffix: A Benchmark for Reflection-Guided Code Repair with Large Language Models (Di et al., Findings 2026)
PDF: https://aclanthology.org/2026.findings-acl.762.pdf
Checklist: 2026.findings-acl.762.checklist.pdf
