Privacy Checklist: Privacy Violation Detection Grounding on Contextual Integrity Theory

Haoran Li; Wei Fan; Yulin Chen; Cheng Jiayang; Tianshu Chu; Xuebing Zhou; Peizhao Hu; Yangqiu Song

doi:10.18653/v1/2025.naacl-long.86

Abstract

Privacy research has attracted wide attention as individuals worry that their private data can be easily leaked during interactions with smart devices, social platforms, and AI applications. Existing work mostly studies privacy attacks and defenses within individual sub-fields, where the focus is on patterns of personally identifiable information (PII). In this paper, we argue that privacy is not solely about PII patterns. We ground our approach in Contextual Integrity (CI) theory, which posits that people’s perceptions of privacy are highly correlated with the corresponding social context. Based on this assumption, we formulate privacy as a reasoning problem rather than naive PII matching. We develop the first comprehensive checklist that covers social identities, private attributes, and existing privacy regulations. Unlike prior work on CI, which either covers a limited set of expert-annotated norms or models incomplete social context, our proposed privacy checklist uses the whole Health Insurance Portability and Accountability Act of 1996 (HIPAA) as an example to show that large language models (LLMs) can be used to completely cover HIPAA’s regulations. Our checklist also gathers expert annotations across multiple ontologies to determine private information, including but not limited to PII. We use our preliminary results on HIPAA to shed light on future context-centric privacy research that covers more privacy regulations, social norms, and standards. We will release the reproducible code and data.

Anthology ID: 2025.naacl-long.86
Volume: Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
Month: April
Year: 2025
Address: Albuquerque, New Mexico
Editors: Luis Chiruzzo, Alan Ritter, Lu Wang
Venue: NAACL
Publisher: Association for Computational Linguistics
Pages: 1748–1766
URL: https://aclanthology.org/2025.naacl-long.86/
DOI: 10.18653/v1/2025.naacl-long.86
Cite (ACL): Haoran Li, Wei Fan, Yulin Chen, Cheng Jiayang, Tianshu Chu, Xuebing Zhou, Peizhao Hu, and Yangqiu Song. 2025. Privacy Checklist: Privacy Violation Detection Grounding on Contextual Integrity Theory. In Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), pages 1748–1766, Albuquerque, New Mexico. Association for Computational Linguistics.
Cite (Informal): Privacy Checklist: Privacy Violation Detection Grounding on Contextual Integrity Theory (Li et al., NAACL 2025)
PDF: https://aclanthology.org/2025.naacl-long.86.pdf
