Semantic Decomposition of Question and SQL for Text-to-SQL Parsing

Ben Eyal, Moran Mahabi, Ophir Haroche, Amir Bachar, Michael Elhadad


Abstract
Text-to-SQL semantic parsing faces challenges in generalizing to cross-domain and complex queries. Recent research has employed a question decomposition strategy to enhance the parsing of complex SQL queries. However, this strategy encounters two major obstacles: (1) existing datasets lack question decomposition; (2) due to the syntactic complexity of SQL, most complex queries cannot be disentangled into sub-queries that can be readily recomposed. To address these challenges, we propose a new modular Query Plan Language (QPL) that systematically decomposes SQL queries into simple and regular sub-queries. We develop a translator from SQL to QPL by leveraging analysis of SQL server query optimization plans, and we augment the Spider dataset with QPL programs. Experimental results demonstrate that the modular nature of QPL benefits existing semantic-parsing architectures, and training text-to-QPL parsers is more effective than text-to-SQL parsing for semantically equivalent queries. The QPL approach offers two additional advantages: (1) QPL programs can be paraphrased as simple questions, which allows us to create a dataset of (complex question, decomposed questions). Training on this dataset, we obtain a Question Decomposer for data retrieval that is sensitive to database schemas. (2) QPL is more accessible to non-experts for complex queries, leading to more interpretable output from the semantic parser.
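To make the idea of decomposing one SQL query into simple, recomposable sub-queries concrete, here is a minimal Python sketch. It is not QPL syntax and not the authors' translator: the `Step` class, the `execute` function, the operator names, the toy `SINGERS` table, and the example question are all assumptions invented for illustration. Each step performs one simple operation, refers to earlier steps by index, and carries a simple-question paraphrase.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, object]
Table = List[Row]


@dataclass
class Step:
    """One simple sub-query (hypothetical structure, not actual QPL)."""
    op: str                     # illustrative operator name
    inputs: List[int]           # indices of earlier steps this step reads
    question: str               # simple-question paraphrase of the step
    run: Callable[..., Table]   # function computing the step's output


def execute(plan: List[Step]) -> Table:
    """Run steps in order, feeding each one the outputs it references."""
    results: List[Table] = []
    for step in plan:
        args = [results[i] for i in step.inputs]
        results.append(step.run(*args))
    return results[-1]


# Toy table, invented for this example.
SINGERS: Table = [
    {"name": "Ana", "country": "France", "age": 30},
    {"name": "Bo", "country": "France", "age": 40},
    {"name": "Cy", "country": "Spain", "age": 25},
]


def count_by_country(rows: Table) -> Table:
    """Group rows by country and count them."""
    counts: Dict[object, int] = {}
    for row in rows:
        counts[row["country"]] = counts.get(row["country"], 0) + 1
    return [{"country": c, "n": n} for c, n in counts.items()]


# Complex question: "How many singers older than 28 are there per country?"
# Equivalent SQL (assumed schema):
#   SELECT country, COUNT(*) FROM singer WHERE age > 28 GROUP BY country
plan = [
    Step("scan", [], "List all singers.", lambda: SINGERS),
    Step("filter", [0], "Which of those singers are older than 28?",
         lambda rows: [r for r in rows if r["age"] > 28]),
    Step("aggregate", [1], "How many of those singers are there per country?",
         count_by_country),
]

result = execute(plan)
decomposed_questions = [step.question for step in plan]
```

Because every step has a single operator and explicit inputs, the plan can be read top to bottom, and the list in `decomposed_questions` pairs the complex question with its simpler parts, which is the kind of (complex question, decomposed questions) pairing the abstract describes.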
Anthology ID:
2023.findings-emnlp.910
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2023
Month:
December
Year:
2023
Address:
Singapore
Editors:
Houda Bouamor, Juan Pino, Kalika Bali
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
13629–13645
URL:
https://aclanthology.org/2023.findings-emnlp.910
DOI:
10.18653/v1/2023.findings-emnlp.910
Cite (ACL):
Ben Eyal, Moran Mahabi, Ophir Haroche, Amir Bachar, and Michael Elhadad. 2023. Semantic Decomposition of Question and SQL for Text-to-SQL Parsing. In Findings of the Association for Computational Linguistics: EMNLP 2023, pages 13629–13645, Singapore. Association for Computational Linguistics.
Cite (Informal):
Semantic Decomposition of Question and SQL for Text-to-SQL Parsing (Eyal et al., Findings 2023)
PDF:
https://aclanthology.org/2023.findings-emnlp.910.pdf