Invernet: An Inversion Attack Framework to Infer Fine-Tuning Datasets through Word Embeddings

Ishrak Hayet; Zijun Yao; Bo Luo

doi:10.18653/v1/2022.findings-emnlp.368

Abstract

Word embedding aims to learn dense representations of words and has become a standard input preparation step in many NLP tasks. Because learning embeddings from scratch is data- and computation-intensive, a more affordable approach is to borrow a publicly available pretrained embedding and fine-tune it on a domain-specific downstream dataset. A privacy concern arises if a malicious owner of the pretrained embedding gains access to the fine-tuned embedding and tries to infer critical information about the downstream dataset. In this study, we propose a novel embedding inversion framework called Invernet that materializes this privacy concern by inferring the context distribution in the downstream dataset, which can lead to a breach of key information. With extensive experimental studies on two real-world news datasets, Antonio Gulli’s News and New York Times, we validate the feasibility of the proposed privacy attack and demonstrate the effectiveness of Invernet in inferring downstream datasets across multiple word embedding methods.
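The threat model in the abstract rests on the attacker holding both the pretrained and the fine-tuned embedding of the same vocabulary. The sketch below is not Invernet, whose method is not described on this page; it only illustrates, under that assumption, the simplest signal such an attacker could read off: how far fine-tuning moved each word vector. The function names and the toy vectors are hypothetical.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as "no measurable shift"
    return 1.0 - dot / (norm_u * norm_v)


def rank_by_shift(pretrained, finetuned):
    """Rank shared words by how far fine-tuning moved their vectors."""
    shared = set(pretrained) & set(finetuned)
    shifts = {w: cosine_distance(pretrained[w], finetuned[w]) for w in shared}
    return sorted(shifts, key=shifts.get, reverse=True)


# Toy, made-up vectors for illustration only (not from the paper's datasets).
pretrained = {"bank": [1.0, 0.0], "river": [0.0, 1.0], "the": [1.0, 1.0]}
finetuned = {"bank": [0.0, 1.0], "river": [0.1, 1.0], "the": [1.0, 1.0]}

# Words listed first moved the most between the two embeddings.
ranking = rank_by_shift(pretrained, finetuned)
print(ranking)
```

Words near the top of such a ranking are the ones whose usage in the downstream data presumably differed most from the pretraining corpus; recovering the actual context distribution, as the abstract describes, requires more than this per-word comparison.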

Anthology ID: 2022.findings-emnlp.368
Volume: Findings of the Association for Computational Linguistics: EMNLP 2022
Month: December
Year: 2022
Address: Abu Dhabi, United Arab Emirates
Editors: Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 5009–5018
URL: https://aclanthology.org/2022.findings-emnlp.368/
DOI: 10.18653/v1/2022.findings-emnlp.368
Cite (ACL): Ishrak Hayet, Zijun Yao, and Bo Luo. 2022. Invernet: An Inversion Attack Framework to Infer Fine-Tuning Datasets through Word Embeddings. In Findings of the Association for Computational Linguistics: EMNLP 2022, pages 5009–5018, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
Cite (Informal): Invernet: An Inversion Attack Framework to Infer Fine-Tuning Datasets through Word Embeddings (Hayet et al., Findings 2022)
PDF: https://aclanthology.org/2022.findings-emnlp.368.pdf
Dataset: 2022.findings-emnlp.368.dataset.zip
Software: 2022.findings-emnlp.368.software.zip
Video: https://aclanthology.org/2022.findings-emnlp.368.mp4
