SEAD: A Surrogate-free Label-only Membership Inference Attack against Pre-trained LLMs with Semantic-Aware Density

Biao Yi; Jiahao Li; Yiming Li; Yu He; Baolei Zhang; Zheli Liu; Dacheng Tao

Abstract

Membership inference attacks (MIAs) aim to determine whether specific data was used to train a model. Existing MIAs against pre-trained Large Language Models (LLMs) typically require access to complete logits (probabilities), but such access is sometimes unavailable in real-world deployments where only the generated text is exposed. Current label-only MIAs rely on surrogate models to estimate the target model’s token probabilities, and we identify two fundamental limitations of this design: high sensitivity to surrogate model selection and significant probability estimation errors. To address these challenges, we propose SEAD (Semantic-Aware Density), a novel surrogate-free label-only MIA that estimates token probabilities directly through Monte Carlo sampling of the target model itself. This eliminates the dependency on surrogate models while reducing probability estimation errors by an order of magnitude. Furthermore, we introduce a semantic-aware density measure that enhances attack effectiveness by considering both exact token matches and semantically similar alternatives, motivated by the observation that LLMs may express memorized information through different but semantically equivalent tokens. Extensive evaluations demonstrate that SEAD consistently outperforms existing label-only attacks and serves as a foundational density estimator in the label-only setting.
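
The two ideas in the abstract, estimating a token probability by repeatedly sampling the target model and crediting semantically similar tokens as well as exact matches, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the toy sampler, the hidden distribution, the similarity table, and every function name below are hypothetical stand-ins chosen only to make the example self-contained, and the paper's actual scoring rule may differ.

```python
import random
from collections import Counter

# Hypothetical stand-in for a label-only target model: the attacker never
# sees these probabilities, only the tokens that get sampled from them.
HIDDEN = {"the cat sat on the": {"mat": 0.6, "rug": 0.3, "moon": 0.1}}


def make_toy_sampler(seed=0):
    """Return a function that samples one next token for a given prefix."""
    rng = random.Random(seed)

    def sample_next_token(prefix):
        dist = HIDDEN[prefix]
        return rng.choices(list(dist), weights=list(dist.values()), k=1)[0]

    return sample_next_token


def toy_similarity(a, b):
    """Assumed similarity in [0, 1]; a real attack would need a proper measure."""
    if a == b:
        return 1.0
    if {a, b} == {"mat", "rug"}:
        return 0.8
    return 0.0


def mc_token_probability(sample_next_token, prefix, target_token, n_samples=1000):
    """Monte Carlo estimate of P(target_token | prefix): fraction of exact matches."""
    counts = Counter(sample_next_token(prefix) for _ in range(n_samples))
    return counts[target_token] / n_samples


def semantic_density(sample_next_token, prefix, target_token, similarity, n_samples=1000):
    """Average similarity between sampled tokens and the target token.

    Exact matches contribute 1.0; near-synonyms contribute partial credit.
    """
    total = 0.0
    for _ in range(n_samples):
        total += similarity(sample_next_token(prefix), target_token)
    return total / n_samples


if __name__ == "__main__":
    prefix, token = "the cat sat on the", "mat"
    exact = mc_token_probability(make_toy_sampler(0), prefix, token, n_samples=5000)
    dense = semantic_density(make_toy_sampler(1), prefix, token, toy_similarity, n_samples=5000)
    print(f"exact-match estimate: {exact:.3f}")
    print(f"semantic-aware density: {dense:.3f}")
```

In this toy setup the exact-match estimate converges toward the hidden probability of the target token, while the semantic-aware score is higher because samples of the near-synonym also count. Turning such scores into a membership decision (for example, by thresholding an aggregate over all tokens of a candidate text) is left out here, since the abstract does not specify it.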

Anthology ID: 2026.findings-acl.337
Volume: Findings of the Association for Computational Linguistics: ACL 2026
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 6794–6813
URL: https://aclanthology.org/2026.findings-acl.337/
Cite (ACL): Biao Yi, Jiahao Li, Yiming Li, Yu He, Baolei Zhang, Zheli Liu, and Dacheng Tao. 2026. SEAD: A Surrogate-free Label-only Membership Inference Attack against Pre-trained LLMs with Semantic-Aware Density. In Findings of the Association for Computational Linguistics: ACL 2026, pages 6794–6813, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): SEAD: A Surrogate-free Label-only Membership Inference Attack against Pre-trained LLMs with Semantic-Aware Density (Yi et al., Findings 2026)
PDF: https://aclanthology.org/2026.findings-acl.337.pdf
Checklist: 2026.findings-acl.337.checklist.pdf
