TrojanSQL: SQL Injection against Natural Language Interface to Database

Jinchuan Zhang; Yan Zhou; Binyuan Hui; Yaxin Liu; Ziming Li; Songlin Hu

doi:10.18653/v1/2023.emnlp-main.264

Abstract

The technology of text-to-SQL has significantly enhanced the efficiency of accessing and manipulating databases. However, limited research has been conducted to study its vulnerabilities emerging from malicious user interaction. By proposing TrojanSQL, a backdoor-based SQL injection framework for text-to-SQL systems, we show how state-of-the-art text-to-SQL parsers can be easily misled to produce harmful SQL statements that can invalidate user queries or compromise sensitive information about the database. The study explores two specific injection attacks, namely boolean-based injection and union-based injection, which use different types of triggers to achieve distinct goals in compromising the parser. Experimental results demonstrate that both medium-sized models based on fine-tuning and LLM-based parsers using prompting techniques are vulnerable to this type of attack, with attack success rates as high as 99% and 89%, respectively. We hope that this study will raise more concerns about the potential security risks of building natural language interfaces to databases.
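To make the two attack classes named in the abstract concrete, the following minimal sketch shows what a boolean-based and a union-based payload do once they appear in the SQL a parser emits. It is an illustration under stated assumptions, not the paper's method: the tables `singer` and `account`, their contents, and the payload strings are hypothetical, and the sketch does not model the triggers or the backdoor itself, only the effect of the resulting SQL on an in-memory SQLite database.

```python
import sqlite3


def build_demo_db():
    """Create a throwaway in-memory database (hypothetical schema and data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE singer (name TEXT, country TEXT);
        INSERT INTO singer VALUES ('Ann', 'France'), ('Bo', 'Japan');
        CREATE TABLE account (username TEXT, password TEXT);
        INSERT INTO account VALUES ('admin', 's3cret');
        """
    )
    return conn


# SQL a parser might emit for "Which singers are from France?" (assumed example).
BENIGN = "SELECT name FROM singer WHERE country = 'France'"

# Boolean-based payload: a tautology makes the WHERE clause always true,
# so the user's filter no longer has any effect.
BOOLEAN_INJECTED = BENIGN + " OR 1 = 1"

# Union-based payload: a second SELECT appends rows from an unrelated table.
UNION_INJECTED = BENIGN + " UNION SELECT password FROM account"


def run(conn, sql):
    """Execute a query and return the first column of each row, sorted."""
    return sorted(row[0] for row in conn.execute(sql))


conn = build_demo_db()
benign_rows = run(conn, BENIGN)
boolean_rows = run(conn, BOOLEAN_INJECTED)
union_rows = run(conn, UNION_INJECTED)

print(benign_rows)   # only the row matching the filter
print(boolean_rows)  # every singer, because the filter is bypassed
print(union_rows)    # the filtered row plus a value from the account table
```

In this toy setting the boolean-based payload corresponds to invalidating the user's query, and the union-based payload corresponds to exposing data the user never asked for, which are the two goals the abstract describes.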

Anthology ID: 2023.emnlp-main.264
Volume: Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
Month: December
Year: 2023
Address: Singapore
Editors: Houda Bouamor, Juan Pino, Kalika Bali
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 4344–4359
URL: https://aclanthology.org/2023.emnlp-main.264/
DOI: 10.18653/v1/2023.emnlp-main.264
Cite (ACL): Jinchuan Zhang, Yan Zhou, Binyuan Hui, Yaxin Liu, Ziming Li, and Songlin Hu. 2023. TrojanSQL: SQL Injection against Natural Language Interface to Database. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 4344–4359, Singapore. Association for Computational Linguistics.
Cite (Informal): TrojanSQL: SQL Injection against Natural Language Interface to Database (Zhang et al., EMNLP 2023)
PDF: https://aclanthology.org/2023.emnlp-main.264.pdf
Video: https://aclanthology.org/2023.emnlp-main.264.mp4
