ColloQL: Robust Text-to-SQL Over Search Queries

Karthik Radhakrishnan; Arvind Srikantan; Xi Victoria Lin

doi:10.18653/v1/2020.intexsempar-1.5

Abstract

Translating natural language utterances into executable queries helps make the vast amount of data stored in relational databases accessible to a wider range of non-tech-savvy end users. Prior work in this area has largely focused on textual input that is linguistically correct and semantically unambiguous. However, real-world user queries are often succinct, colloquial, and noisy, resembling the input to a search engine. In this work, we introduce data augmentation techniques and a sampling-based content-aware BERT model (ColloQL) to achieve robust text-to-SQL modeling over natural language search (NLS) questions. Because suitable evaluation data is lacking, we curate a new dataset of NLS questions and demonstrate the efficacy of our approach. ColloQL’s superior performance extends to well-formed text, achieving 84.9% logical form accuracy and 90.7% execution accuracy on the WikiSQL dataset, making it, to the best of our knowledge, the highest-performing model that does not use execution-guided decoding.
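The abstract reports two metrics, logical form accuracy and execution accuracy. The sketch below illustrates how they can differ on a single example. It is not the paper's or WikiSQL's evaluation code: the table, the queries, and the helper names `logical_match` and `execution_match` are hypothetical, and the string comparison is a crude stand-in for comparing structured logical forms.

```python
import sqlite3


def logical_match(pred_sql: str, gold_sql: str) -> bool:
    """Simplified logical form check: compare case-folded, whitespace-normalized text."""
    def normalize(sql: str) -> str:
        return " ".join(sql.lower().split())
    return normalize(pred_sql) == normalize(gold_sql)


def execution_match(conn: sqlite3.Connection, pred_sql: str, gold_sql: str) -> bool:
    """Execution check: both queries must return the same rows, ignoring order."""
    try:
        pred_rows = conn.execute(pred_sql).fetchall()
    except sqlite3.Error:
        # A prediction that fails to execute counts as wrong.
        return False
    gold_rows = conn.execute(gold_sql).fetchall()
    return sorted(pred_rows) == sorted(gold_rows)


# Hypothetical toy table, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE players (name TEXT, team TEXT, points INTEGER)")
conn.executemany(
    "INSERT INTO players VALUES (?, ?, ?)",
    [("Ann", "Reds", 30), ("Bo", "Blues", 12)],
)

gold = "SELECT name FROM players WHERE points > 20"
pred = "SELECT name FROM players WHERE team = 'Reds'"

print("logical match:", logical_match(pred, gold))
print("execution match:", execution_match(conn, pred, gold))
```

Here the predicted query differs from the gold query but happens to return the same row, so it counts as correct under the execution check and incorrect under the logical form check. This is one reason execution accuracy is typically the higher of the two figures.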

Anthology ID: 2020.intexsempar-1.5
Volume: Proceedings of the First Workshop on Interactive and Executable Semantic Parsing
Month: November
Year: 2020
Address: Online
Editors: Ben Bogin, Srinivasan Iyer, Xi Victoria Lin, Dragomir Radev, Alane Suhr, Panupong, Caiming Xiong, Pengcheng Yin, Tao Yu, Rui Zhang, Victor Zhong
Venue: intexsempar
Publisher: Association for Computational Linguistics
Pages: 34–45
URL: https://aclanthology.org/2020.intexsempar-1.5/
DOI: 10.18653/v1/2020.intexsempar-1.5
Cite (ACL): Karthik Radhakrishnan, Arvind Srikantan, and Xi Victoria Lin. 2020. ColloQL: Robust Text-to-SQL Over Search Queries. In Proceedings of the First Workshop on Interactive and Executable Semantic Parsing, pages 34–45, Online. Association for Computational Linguistics.
Cite (Informal): ColloQL: Robust Text-to-SQL Over Search Queries (Radhakrishnan et al., intexsempar 2020)
PDF: https://aclanthology.org/2020.intexsempar-1.5.pdf
Video: https://slideslive.com/38939457
