Yanqi Song


2025

Discarding the Crutches: Adaptive Parameter-Efficient Expert Meta-Learning for Continual Semantic Parsing
Ruiheng Liu | Jinyu Zhang | Yanqi Song | Yu Zhang | Bailong Yang
Proceedings of the 31st International Conference on Computational Linguistics

Continual Semantic Parsing (CSP) enables parsers to generate SQL from natural language questions in task streams, using minimal annotated data to handle dynamically evolving databases in real-world scenarios. Previous works often rely on replaying historical data, which poses privacy concerns. Recently, replay-free continual learning methods based on Parameter-Efficient Tuning (PET) have gained widespread attention. However, they often rely on ideal settings and initial task data, sacrificing the model’s generalization ability, which limits their applicability in real-world scenarios. To address this, we propose a novel Adaptive PET eXpert meta-learning (APEX) approach for CSP. First, SQL syntax guides the LLM to assist experts in adaptively warming up, ensuring better model initialization. Then, a dynamically expanding expert pool stores knowledge and explores the relationship between experts and instances. Finally, a selection/fusion inference strategy based on sample historical visibility promotes expert collaboration. Experiments on two CSP benchmarks show that our method achieves superior performance without data replay or ideal settings, effectively handling cold start scenarios and generalizing to unseen tasks, even surpassing performance upper bounds.

2024

SecureSQL: Evaluating Data Leakage of Large Language Models as Natural Language Interfaces to Databases
Yanqi Song | Ruiheng Liu | Shu Chen | Qianhao Ren | Yu Zhang | Yongqi Yu
Findings of the Association for Computational Linguistics: EMNLP 2024

With the widespread application of Large Language Models (LLMs) in Natural Language Interfaces to Databases (NLIDBs), concerns about security issues in NLIDBs have been gradually increasing. However, research on sensitive data leakage in NLIDBs is relatively limited. Therefore, we propose a benchmark to assess the potential of language models to leak sensitive data when generating SQL queries. This benchmark covers 932 samples from 34 different domains, including medical, legal, financial, and political aspects. We evaluate 15 models from six LLM families, and the results show that the best-performing model achieves an accuracy of 61.7%, whereas humans achieve an accuracy of 94%. Most models perform close to or even below the level of random selection. We also evaluate two common attack methods, namely prompt injection and inference attacks, as well as a defense method based on chain-of-thought (CoT) prompting. Experimental results show that both attack methods significantly impact the model, while the CoT-based defense does not significantly improve accuracy, further highlighting the severity of sensitive data leakage issues in NLIDBs. We hope this research will draw more attention from researchers and prompt further study of this issue.