Seeing the Forest through the Trees: Data Leakage from Partial Transformer Gradients

Weijun Li, Qiongkai Xu, Mark Dras


Abstract
Recent studies have shown that distributed machine learning is vulnerable to gradient inversion attacks, where private training data can be reconstructed by analyzing the gradients of the models shared during training. Previous attacks established that such reconstructions are possible using gradients from all parameters of the entire model. However, we hypothesize that most of the involved modules, or even their sub-modules, are at risk of training data leakage, and we validate such vulnerabilities in various intermediate layers of language models. Our extensive experiments reveal that gradients from a single Transformer layer, or even a single linear component holding 0.54% of the parameters, are susceptible to training data leakage. Additionally, we show that applying differential privacy to gradients during training offers limited protection against this new vulnerability of data disclosure.
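
For readers unfamiliar with gradient inversion, the sketch below illustrates the general gradient-matching idea the paper builds on, restricted to the gradients of a single linear sub-module: an attacker optimizes dummy input embeddings until their gradients on that one layer match the gradients observed during training. This is a minimal illustration under simplified assumptions (a tiny toy model, labels known to the attacker, attacker access to the embedding table), not the authors' implementation; the module names and hyperparameters are invented for the example.

import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy "language model": embedding -> single linear layer -> vocabulary head.
vocab, hidden, seq_len = 50, 16, 4
embed = nn.Embedding(vocab, hidden)
linear = nn.Linear(hidden, hidden)   # the single linear component whose gradients leak
head = nn.Linear(hidden, vocab)
attacked_params = tuple(linear.parameters())

def forward_loss(inputs_embeds, labels):
    h = torch.tanh(linear(inputs_embeds))
    logits = head(h)
    return nn.functional.cross_entropy(logits.view(-1, vocab), labels.view(-1))

# Victim side: compute the gradients of ONLY the linear layer on a private example.
private_ids = torch.randint(0, vocab, (1, seq_len))
private_embeds = embed(private_ids).detach()
true_grads = torch.autograd.grad(forward_loss(private_embeds, private_ids),
                                 attacked_params)

# Attacker side: optimize dummy embeddings so that their gradients on that one
# layer match the observed ones (labels assumed known to keep the sketch short).
dummy_embeds = torch.randn_like(private_embeds, requires_grad=True)
optimizer = torch.optim.Adam([dummy_embeds], lr=0.1)
for _ in range(300):
    optimizer.zero_grad()
    dummy_grads = torch.autograd.grad(forward_loss(dummy_embeds, private_ids),
                                      attacked_params, create_graph=True)
    match_loss = sum(((dg - tg) ** 2).sum()
                     for dg, tg in zip(dummy_grads, true_grads))
    match_loss.backward()
    optimizer.step()

# Decode the recovered embeddings back to the nearest token ids.
recovered_ids = torch.cdist(dummy_embeds.detach().squeeze(0), embed.weight).argmin(-1)
print("private ids  :", private_ids[0].tolist())
print("recovered ids:", recovered_ids.tolist())

Whether the recovered ids match the private ones exactly depends on the seed, the number of steps, and the optimizer settings; the sketch only shows that the gradients of one small sub-module already constrain the private input, which is the question the paper studies at the scale of real Transformer language models.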
Anthology ID:
2024.emnlp-main.275
Volume:
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2024
Address:
Miami, Florida, USA
Editors:
Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
4786–4798
URL:
https://aclanthology.org/2024.emnlp-main.275
Cite (ACL):
Weijun Li, Qiongkai Xu, and Mark Dras. 2024. Seeing the Forest through the Trees: Data Leakage from Partial Transformer Gradients. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 4786–4798, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal):
Seeing the Forest through the Trees: Data Leakage from Partial Transformer Gradients (Li et al., EMNLP 2024)
PDF:
https://aclanthology.org/2024.emnlp-main.275.pdf
Software:
2024.emnlp-main.275.software.zip