@inproceedings{ramrakhiyani-etal-2025-queryshield,
    title     = {{Q}uery{S}hield: A Platform to Mitigate Enterprise Data Leakage in Queries to External {LLM}s},
    author    = {Ramrakhiyani, Nitin and
                 Myalil, Delton and
                 Pawar, Sachin and
                 Apte, Manoj and
                 A, Rajan M and
                 Saglani, Divyesh and
                 Shaik, Imtiyazuddin},
    editor    = {Chen, Weizhu and
                 Yang, Yi and
                 Kachuee, Mohammad and
                 Fu, Xue-Yong},
    booktitle = {Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 3: Industry Track)},
    month     = apr,
    year      = {2025},
    address   = {Albuquerque, New Mexico},
    publisher = {Association for Computational Linguistics},
    url       = {https://aclanthology.org/2025.naacl-industry.30/},
    doi       = {10.18653/v1/2025.naacl-industry.30},
    pages     = {358--369},
    isbn      = {979-8-89176-194-0},
    abstract  = {Unrestricted access to external Large Language Models (LLM) based services like ChatGPT and Gemini can lead to potential data leakages, especially for large enterprises providing products and services to clients that require legal confidentiality guarantees. However, a blanket restriction on such services is not ideal as these LLMs boost employee productivity. Our goal is to build a solution that enables enterprise employees to query such external LLMs, without leaking confidential internal and client information. In this paper, we propose QueryShield - a platform that enterprises can use to interact with external LLMs without leaking data through queries. It detects if a query leaks data, and rephrases it to minimize data leakage while limiting the impact to its semantics. We construct a dataset of 1500 queries and manually annotate them for their sensitivity labels and their low sensitivity rephrased versions. We fine-tune a set of lightweight model candidates using this dataset and evaluate them using multiple metrics including one we propose specific to this problem.}
}
<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="ramrakhiyani-etal-2025-queryshield">
<titleInfo>
<title>QueryShield: A Platform to Mitigate Enterprise Data Leakage in Queries to External LLMs</title>
</titleInfo>
<name type="personal">
<namePart type="given">Nitin</namePart>
<namePart type="family">Ramrakhiyani</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Delton</namePart>
<namePart type="family">Myalil</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Sachin</namePart>
<namePart type="family">Pawar</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Manoj</namePart>
<namePart type="family">Apte</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Rajan</namePart>
<namePart type="given">M</namePart>
<namePart type="family">A</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Divyesh</namePart>
<namePart type="family">Saglani</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Imtiyazuddin</namePart>
<namePart type="family">Shaik</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<originInfo>
<dateIssued>2025-04</dateIssued>
</originInfo>
<typeOfResource>text</typeOfResource>
<relatedItem type="host">
<titleInfo>
<title>Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 3: Industry Track)</title>
</titleInfo>
<name type="personal">
<namePart type="given">Weizhu</namePart>
<namePart type="family">Chen</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Yi</namePart>
<namePart type="family">Yang</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Mohammad</namePart>
<namePart type="family">Kachuee</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Xue-Yong</namePart>
<namePart type="family">Fu</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<originInfo>
<publisher>Association for Computational Linguistics</publisher>
<place>
<placeTerm type="text">Albuquerque, New Mexico</placeTerm>
</place>
</originInfo>
<genre authority="marcgt">conference publication</genre>
<identifier type="isbn">979-8-89176-194-0</identifier>
</relatedItem>
<abstract>Unrestricted access to external Large Language Models (LLM) based services like ChatGPT and Gemini can lead to potential data leakages, especially for large enterprises providing products and services to clients that require legal confidentiality guarantees. However, a blanket restriction on such services is not ideal as these LLMs boost employee productivity. Our goal is to build a solution that enables enterprise employees to query such external LLMs, without leaking confidential internal and client information. In this paper, we propose QueryShield - a platform that enterprises can use to interact with external LLMs without leaking data through queries. It detects if a query leaks data, and rephrases it to minimize data leakage while limiting the impact to its semantics. We construct a dataset of 1500 queries and manually annotate them for their sensitivity labels and their low sensitivity rephrased versions. We fine-tune a set of lightweight model candidates using this dataset and evaluate them using multiple metrics including one we propose specific to this problem.</abstract>
<identifier type="citekey">ramrakhiyani-etal-2025-queryshield</identifier>
<identifier type="doi">10.18653/v1/2025.naacl-industry.30</identifier>
<location>
<url>https://aclanthology.org/2025.naacl-industry.30/</url>
</location>
<part>
<date>2025-04</date>
<extent unit="page">
<start>358</start>
<end>369</end>
</extent>
</part>
</mods>
</modsCollection>
%0 Conference Proceedings
%T QueryShield: A Platform to Mitigate Enterprise Data Leakage in Queries to External LLMs
%A Ramrakhiyani, Nitin
%A Myalil, Delton
%A Pawar, Sachin
%A Apte, Manoj
%A A, Rajan M.
%A Saglani, Divyesh
%A Shaik, Imtiyazuddin
%Y Chen, Weizhu
%Y Yang, Yi
%Y Kachuee, Mohammad
%Y Fu, Xue-Yong
%S Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 3: Industry Track)
%D 2025
%8 April
%I Association for Computational Linguistics
%C Albuquerque, New Mexico
%@ 979-8-89176-194-0
%F ramrakhiyani-etal-2025-queryshield
%X Unrestricted access to external Large Language Models (LLM) based services like ChatGPT and Gemini can lead to potential data leakages, especially for large enterprises providing products and services to clients that require legal confidentiality guarantees. However, a blanket restriction on such services is not ideal as these LLMs boost employee productivity. Our goal is to build a solution that enables enterprise employees to query such external LLMs, without leaking confidential internal and client information. In this paper, we propose QueryShield - a platform that enterprises can use to interact with external LLMs without leaking data through queries. It detects if a query leaks data, and rephrases it to minimize data leakage while limiting the impact to its semantics. We construct a dataset of 1500 queries and manually annotate them for their sensitivity labels and their low sensitivity rephrased versions. We fine-tune a set of lightweight model candidates using this dataset and evaluate them using multiple metrics including one we propose specific to this problem.
%R 10.18653/v1/2025.naacl-industry.30
%U https://aclanthology.org/2025.naacl-industry.30/
%U https://doi.org/10.18653/v1/2025.naacl-industry.30
%P 358-369
Markdown (Informal)
[QueryShield: A Platform to Mitigate Enterprise Data Leakage in Queries to External LLMs](https://aclanthology.org/2025.naacl-industry.30/) (Ramrakhiyani et al., NAACL 2025)
ACL
- Nitin Ramrakhiyani, Delton Myalil, Sachin Pawar, Manoj Apte, Rajan M A, Divyesh Saglani, and Imtiyazuddin Shaik. 2025. QueryShield: A Platform to Mitigate Enterprise Data Leakage in Queries to External LLMs. In Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 3: Industry Track), pages 358–369, Albuquerque, New Mexico. Association for Computational Linguistics.