A Review of Cross-Domain Text-to-SQL Models

Yujian Gan, Matthew Purver, John R. Woodward


Abstract
WikiSQL and Spider, two large-scale cross-domain text-to-SQL datasets, have attracted much attention from the research community. Their leaderboards show that many researchers have proposed models to solve the text-to-SQL problem. This paper first divides the top models on these two leaderboards into two paradigms. We then present details not mentioned in the original papers by evaluating key components, including schema linking, pretrained word embeddings, and reasoning assistance modules. Based on this analysis, we aim to promote understanding of the text-to-SQL field and identify some interesting directions for future work. For example, it is worth studying the text-to-SQL problem in settings where schema linking is more challenging to build, and also worth studying how to combine the advantages of each model.
Anthology ID:
2020.aacl-srw.16
Volume:
Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing: Student Research Workshop
Month:
December
Year:
2020
Address:
Suzhou, China
Venue:
AACL
Publisher:
Association for Computational Linguistics
Pages:
108–115
URL:
https://aclanthology.org/2020.aacl-srw.16
Cite (ACL):
Yujian Gan, Matthew Purver, and John R. Woodward. 2020. A Review of Cross-Domain Text-to-SQL Models. In Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing: Student Research Workshop, pages 108–115, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
A Review of Cross-Domain Text-to-SQL Models (Gan et al., AACL 2020)
PDF:
https://aclanthology.org/2020.aacl-srw.16.pdf
Data
WikiSQL