SQL Injection Jailbreak: A Structural Disaster of Large Language Models

Jiawei Zhao; Kejiang Chen; Weiming Zhang; Nenghai Yu

doi:10.18653/v1/2025.findings-acl.358

Abstract

Large Language Models (LLMs) are susceptible to jailbreak attacks that can induce them to generate harmful content. Previous jailbreak methods, such as optimization-based methods and methods that leverage the model’s context-learning abilities, primarily exploit the internal properties or capabilities of LLMs. In this paper, we introduce a novel jailbreak method, SQL Injection Jailbreak (SIJ), which targets the external properties of LLMs, specifically the way LLMs construct input prompts. By injecting jailbreak information into user prompts, SIJ successfully induces the model to output harmful content. For open-source models, SIJ achieves attack success rates near 100% on five well-known LLMs on the AdvBench and HEx-PHI benchmarks, while incurring lower time costs than previous methods. For closed-source models, SIJ achieves an average attack success rate of over 85% across five models in the GPT and Doubao series. SIJ thus exposes a new vulnerability in LLMs that urgently requires mitigation. To address this, we propose a simple adaptive defense method called Self-Reminder-Key to counter SIJ and demonstrate its effectiveness through experimental results. Our code is available at https://github.com/weiyezhimeng/SQL-Injection-Jailbreak.

Anthology ID: 2025.findings-acl.358
Volume: Findings of the Association for Computational Linguistics: ACL 2025
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 6871–6891
URL: https://aclanthology.org/2025.findings-acl.358/
DOI: 10.18653/v1/2025.findings-acl.358
Cite (ACL): Jiawei Zhao, Kejiang Chen, Weiming Zhang, and Nenghai Yu. 2025. SQL Injection Jailbreak: A Structural Disaster of Large Language Models. In Findings of the Association for Computational Linguistics: ACL 2025, pages 6871–6891, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): SQL Injection Jailbreak: A Structural Disaster of Large Language Models (Zhao et al., Findings 2025)
PDF: https://aclanthology.org/2025.findings-acl.358.pdf
