Tracing Training Footprints: A Calibration Approach for Membership Inference Attacks Against Multimodal Large Language Models

Xiaofan Zheng, Huixuan Zhang, Xiaojun Wan


Abstract
As the training data of Multimodal Large Language Models (MLLMs) grows in scale while its details remain undisclosed, concerns about privacy breaches and data security are mounting. Membership Inference Attacks (MIA) under black-box access have therefore attracted increasing attention. In real-world applications, where most samples are non-members, non-member samples that are over-represented in the data manifold are easily misclassified as members, making this problem especially prominent. This has motivated recent work on effective difficulty calibration strategies, with promising results. However, these methods consider only text input during calibration, and their effectiveness diminishes when transferred to MLLMs because of the presence of visual embeddings. To address this problem, we propose PC-MMIA, which focuses on visual instruction fine-tuning data. PC-MMIA builds on the idea that tokens located in poorly generalized local manifolds better reflect traces of member samples seen during training. By applying bidirectional perturbation to image embeddings to identify tokens critical to MIA and assigning them different weights, we achieve difficulty calibration. Experimental results demonstrate that our proposed method surpasses existing methods.
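The abstract's calibration idea can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the sensitivity measure (absolute log-probability shift under two opposing perturbations), and the weighted-sum score are all illustrative assumptions. The sketch only shows the general shape of the idea: tokens whose log-probabilities move most when the image embedding is perturbed in both directions receive larger weights in the membership score.

```python
def perturbation_weights(base_logps, logps_plus, logps_minus):
    """Hypothetical per-token sensitivity to bidirectional image-embedding
    perturbation. Tokens whose log-probability shifts most under the +/-
    perturbations are treated as lying in poorly generalized local manifolds
    and are given larger weights. Inputs are per-token log-probabilities of
    the same response under the original and the two perturbed embeddings."""
    sens = [abs(b - p) + abs(b - m)
            for b, p, m in zip(base_logps, logps_plus, logps_minus)]
    total = sum(sens)
    # Normalize so the weights sum to 1.
    return [s / total for s in sens]

def membership_score(base_logps, logps_plus, logps_minus):
    """Illustrative calibrated score: a sensitivity-weighted sum of the
    unperturbed per-token log-probabilities. Higher (less negative) scores
    would suggest membership under this toy scoring rule."""
    w = perturbation_weights(base_logps, logps_plus, logps_minus)
    return sum(wi * bi for wi, bi in zip(w, base_logps))

# Toy example with made-up per-token log-probabilities for three tokens.
base = [-0.5, -2.0, -1.0]
plus = [-0.6, -3.0, -1.1]   # after perturbing the image embedding one way
minus = [-0.4, -2.5, -0.9]  # after perturbing it the opposite way
score = membership_score(base, plus, minus)
```

In this toy run the second token is far more sensitive to the perturbations, so the weighted score is pulled toward its log-probability rather than the plain average; that re-weighting is the calibration effect the abstract describes.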
Anthology ID:
2025.findings-emnlp.931
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2025
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
17179–17191
URL:
https://aclanthology.org/2025.findings-emnlp.931/
Cite (ACL):
Xiaofan Zheng, Huixuan Zhang, and Xiaojun Wan. 2025. Tracing Training Footprints: A Calibration Approach for Membership Inference Attacks Against Multimodal Large Language Models. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 17179–17191, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
Tracing Training Footprints: A Calibration Approach for Membership Inference Attacks Against Multimodal Large Language Models (Zheng et al., Findings 2025)
PDF:
https://aclanthology.org/2025.findings-emnlp.931.pdf
Checklist:
 2025.findings-emnlp.931.checklist.pdf