MSc-SQL: Multi-Sample Critiquing Small Language Models For Text-To-SQL Translation

Satya Krishna Gorti; Ilan Gofman; Zhaoyan Liu; Jiapeng Wu; Noël Vouitsis; Guangwei Yu; Jesse C. Cresswell; Rasa Hosseinzadeh

doi:10.18653/v1/2025.naacl-long.107

Abstract

Text-to-SQL generation enables non-experts to interact with databases via natural language. Recent advances rely on large closed-source models like GPT-4 that present challenges in accessibility, privacy, and latency. To address these issues, we focus on developing small, efficient, and open-source text-to-SQL models. We demonstrate the benefits of sampling multiple candidate SQL generations and propose our method, MSc-SQL, to critique them using associated metadata. Our sample critiquing model evaluates multiple outputs simultaneously, achieving state-of-the-art performance compared to other open-source models while remaining competitive with larger models at a much lower cost. Full code can be found at github.com/layer6ai-labs/msc-sql.
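The sample-then-critique idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes, for illustration only, that the "associated metadata" is each candidate's execution outcome on the target database, and a simple agreement heuristic stands in for the learned critiquing model. The function names `execute_candidate` and `pick_candidate`, the `singer` table, and the candidate queries are all hypothetical.

```python
import sqlite3
from collections import Counter


def execute_candidate(conn, sql, max_rows=5):
    """Run one candidate query and return its metadata (assumed here to be
    the execution outcome: a preview of the rows, or the error message)."""
    try:
        rows = conn.execute(sql).fetchmany(max_rows)
        return {"sql": sql, "ok": True, "rows": rows, "error": None}
    except sqlite3.Error as exc:
        return {"sql": sql, "ok": False, "rows": None, "error": str(exc)}


def pick_candidate(metadata):
    """Hypothetical stand-in for a learned critic: discard candidates that
    fail to execute, then prefer the one whose result preview agrees with
    the largest number of other candidates (ties go to the earliest)."""
    valid = [m for m in metadata if m["ok"]]
    if not valid:
        return None
    counts = Counter(repr(m["rows"]) for m in valid)
    return max(valid, key=lambda m: counts[repr(m["rows"])])["sql"]


# Toy database and hand-written "sampled" candidates (no model is called).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE singer (name TEXT, age INTEGER)")
conn.executemany("INSERT INTO singer VALUES (?, ?)", [("Ann", 30), ("Bob", 45)])

candidates = [
    "SELECT COUNT(*) FROM singers",    # wrong table name, fails to execute
    "SELECT COUNT(*) FROM singer",
    "SELECT COUNT(name) FROM singer",
]

# All candidates are scored together, rather than one at a time.
metadata = [execute_candidate(conn, sql) for sql in candidates]
best = pick_candidate(metadata)
print(best)
```

In a real system the candidates would come from sampling a text-to-SQL model several times, and the selection step would be a trained model that reads the candidates and their metadata jointly. The heuristic above only shows where such a critic would sit in the pipeline.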

Anthology ID: 2025.naacl-long.107
Volume: Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
Month: April
Year: 2025
Address: Albuquerque, New Mexico
Editors: Luis Chiruzzo, Alan Ritter, Lu Wang
Venue: NAACL
Publisher: Association for Computational Linguistics
Pages: 2145–2160
URL: https://aclanthology.org/2025.naacl-long.107/
DOI: 10.18653/v1/2025.naacl-long.107
Cite (ACL): Satya Krishna Gorti, Ilan Gofman, Zhaoyan Liu, Jiapeng Wu, Noël Vouitsis, Guangwei Yu, Jesse C. Cresswell, and Rasa Hosseinzadeh. 2025. MSc-SQL: Multi-Sample Critiquing Small Language Models For Text-To-SQL Translation. In Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), pages 2145–2160, Albuquerque, New Mexico. Association for Computational Linguistics.
Cite (Informal): MSc-SQL: Multi-Sample Critiquing Small Language Models For Text-To-SQL Translation (Gorti et al., NAACL 2025)
PDF: https://aclanthology.org/2025.naacl-long.107.pdf
