ReSQL: Self-Improving Framework for Reasoning-Aware Text-to-SQL Dataset Generation

Minjun Park; Yongju Seong; Myoseop Sim; Kyungkoo Min; Stanley Jungkyu Choi

Abstract

Recent advances in Text-to-SQL have greatly benefited from large language models, yet small and medium-sized models still suffer from frequent execution errors and limited self-correction ability. We present ReSQL (Retrieval-augmented error reasoning for Text-to-SQL), a self-improving framework that generates and learns from its own error-reasoning dataset, enabling models to autonomously refine their SQL generation and correction capabilities. ReSQL combines feedback-driven fine-tuning with retrieval-based inference: it gathers model-generated errors, analyzes them through structured feedback prompts, and retrieves relevant correction examples during inference. This unified approach allows models to internalize robust error-reasoning patterns and dynamically apply them to unseen queries. Experimental results on the SPIDER and BIRD benchmarks show that ReSQL substantially improves execution accuracy and self-correction ability over strong baselines, achieving performance competitive with much larger proprietary models such as GPT-4. Our findings highlight ReSQL as a promising step toward self-improving, reasoning-aware Text-to-SQL systems that can continually enhance their reliability and interpretability without external supervision. All code and generated reasoning datasets are available to facilitate application to open-source LLMs and reproducible baseline training.

Anthology ID: 2026.findings-acl.1677
Volume: Findings of the Association for Computational Linguistics: ACL 2026
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 33582–33602
URL: https://aclanthology.org/2026.findings-acl.1677/
Cite (ACL): Minjun Park, Yongju Seong, Myoseop Sim, Kyungkoo Min, and Stanley Jungkyu Choi. 2026. ReSQL: Self-Improving Framework for Reasoning-Aware Text-to-SQL Dataset Generation. In Findings of the Association for Computational Linguistics: ACL 2026, pages 33582–33602, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): ReSQL: Self-Improving Framework for Reasoning-Aware Text-to-SQL Dataset Generation (Park et al., Findings 2026)
PDF: https://aclanthology.org/2026.findings-acl.1677.pdf
Checklist: 2026.findings-acl.1677.checklist.pdf
