What It Takes to Achieve 100% Condition Accuracy on WikiSQL

Semih Yavuz, Izzeddin Gur, Yu Su, Xifeng Yan


Abstract
WikiSQL is a newly released dataset for studying the natural language sequence to SQL translation problem. The SQL queries in WikiSQL are simple: each involves one relation and does not have any join operation. Despite its simplicity, none of the publicly reported structured query generation models can achieve an accuracy beyond 62%, which is still far from enough for practical use. In this paper, we ask two questions, “Why is the accuracy still low for such simple queries?” and “What does it take to achieve 100% accuracy on WikiSQL?” To limit the scope of our study, we focus on the WHERE clause in SQL. The answers will help us gain insights about the directions we should explore in order to further improve the translation accuracy. We will then investigate alternative solutions to realize the potential ceiling performance on WikiSQL. Our proposed solution can reach up to 88.6% condition accuracy on the WikiSQL dataset.
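To make the notion of condition accuracy concrete, the following is a minimal sketch and not the authors' evaluation code. The abstract does not define the metric, so the sketch assumes that a WHERE clause counts as correct only when its set of (column, operator, value) conditions matches the gold set exactly, ignoring order and letter case. The names `normalize`, `conditions_match`, and `condition_accuracy`, as well as the example data, are hypothetical.

```python
from typing import List, Tuple

# A WHERE-clause condition as a (column, operator, value) triple.
Condition = Tuple[str, str, str]


def normalize(cond: Condition) -> Condition:
    """Lowercase and strip each part so trivial formatting differences are ignored."""
    col, op, val = cond
    return (col.strip().lower(), op.strip(), str(val).strip().lower())


def conditions_match(pred: List[Condition], gold: List[Condition]) -> bool:
    """True when predicted and gold conditions agree as unordered sets (assumed criterion)."""
    return {normalize(c) for c in pred} == {normalize(c) for c in gold}


def condition_accuracy(preds: List[List[Condition]], golds: List[List[Condition]]) -> float:
    """Fraction of examples whose whole WHERE clause matches the gold clause."""
    if not golds:
        return 0.0
    hits = sum(conditions_match(p, g) for p, g in zip(preds, golds))
    return hits / len(golds)


# Hypothetical example data (not taken from WikiSQL).
golds = [
    [("player", "=", "Jane Doe")],
    [("year", ">", "2000"), ("team", "=", "Lions")],
]
preds = [
    [("Player", "=", "jane doe")],                     # matches after normalization
    [("year", ">", "2000"), ("team", "=", "Tigers")],  # wrong value in second condition
]
accuracy = condition_accuracy(preds, golds)
```

Under this assumed criterion a single wrong column, operator, or value makes the whole clause count as incorrect, which is one reason clause-level accuracy can stay low even for single-table queries without joins.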
Anthology ID:
D18-1197
Volume:
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
Month:
October-November
Year:
2018
Address:
Brussels, Belgium
Editors:
Ellen Riloff, David Chiang, Julia Hockenmaier, Jun’ichi Tsujii
Venue:
EMNLP
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Pages:
1702–1711
URL:
https://aclanthology.org/D18-1197
DOI:
10.18653/v1/D18-1197
Cite (ACL):
Semih Yavuz, Izzeddin Gur, Yu Su, and Xifeng Yan. 2018. What It Takes to Achieve 100% Condition Accuracy on WikiSQL. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 1702–1711, Brussels, Belgium. Association for Computational Linguistics.
Cite (Informal):
What It Takes to Achieve 100% Condition Accuracy on WikiSQL (Yavuz et al., EMNLP 2018)
PDF:
https://aclanthology.org/D18-1197.pdf
Data
WikiSQL