Toward Privacy-preserving Text Embedding Similarity with Homomorphic Encryption

Donggyu Kim, Garam Lee, Sungwoo Oh


Abstract
Text embedding is an essential component for building efficient natural language applications based on text similarity, such as search engines and chatbots. Certain industries, such as finance and healthcare, demand strict privacy guarantees: a user’s data must not be exposed to any potentially malicious party, including the service provider. From a privacy standpoint, text embeddings may seem impossible to interpret, but there is still a risk that the original text can be recovered from them through inversion attacks. To satisfy such privacy requirements, in this paper we study text similarity inference based on Homomorphic Encryption (HE). To validate our method, we perform extensive experiments on two vital text similarity tasks. Through text embedding inversion tests, we demonstrate that the benchmark datasets are vulnerable to inversion attacks and that another privacy-preserving approach, dχ-privacy, a relaxed version of Local Differential Privacy, fails to prevent them. We show that our approach preserves model performance, whereas the baseline degrades scores by up to 10% at the minimum security level.
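To make the idea of similarity inference over encrypted embeddings concrete, the following is a minimal sketch and not the authors' method: the abstract does not name the HE scheme or protocol used. The sketch assumes a toy Paillier cryptosystem (additively homomorphic only), fixed-point encoding, and a setting where the client encrypts its unit-normalized embedding while the server's embedding stays in plaintext. All names (`encrypt`, `encrypted_dot`, `private_cosine`, `SCALE`) and the key size are hypothetical and chosen for illustration; the parameters are far too small to be secure.

```python
import math
import secrets

# Toy Paillier key built from two Mersenne primes (assumption: demo only, NOT secure).
P = 2**31 - 1
Q = 2**61 - 1
N = P * Q                      # public modulus
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # private: lcm(p-1, q-1)
MU = pow(LAM, -1, N)           # private: inverse of LAM mod N (Python 3.8+)
SCALE = 10**4                  # fixed-point scale for float coordinates


def encrypt(m: int) -> int:
    """Encrypt integer m (negative values are mapped into Z_N)."""
    r = secrets.randbelow(N - 1) + 1
    return ((1 + (m % N) * N) * pow(r, N, N2)) % N2


def decrypt(c: int) -> int:
    """Decrypt and map the result back to a signed integer."""
    m = ((pow(c, LAM, N2) - 1) // N) * MU % N
    return m - N if m > N // 2 else m


def normalize(v):
    """Scale v to unit length so the dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def encode(v):
    """Fixed-point encode a float vector as integers."""
    return [round(x * SCALE) for x in v]


def encrypted_dot(enc_x, y_fixed):
    """Server side: dot product of an encrypted vector with a plaintext one.

    Multiplying ciphertexts adds plaintexts; raising a ciphertext to k
    multiplies its plaintext by k.
    """
    acc = encrypt(0)
    for c, k in zip(enc_x, y_fixed):
        acc = (acc * pow(c, k % N, N2)) % N2
    return acc


def private_cosine(x, y):
    enc_x = [encrypt(m) for m in encode(normalize(x))]      # client encrypts
    enc_score = encrypted_dot(enc_x, encode(normalize(y)))  # server computes blindly
    return decrypt(enc_score) / SCALE**2                    # client decrypts


if __name__ == "__main__":
    print(private_cosine([1.0, 2.0, 3.0], [2.0, 1.0, -1.0]))
```

In this sketch the server never sees the client's embedding or the resulting score; it only manipulates ciphertexts. Schemes that support approximate arithmetic on real-valued vectors would avoid the manual fixed-point encoding, but that choice is outside what the abstract specifies.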
Anthology ID:
2022.finnlp-1.4
Volume:
Proceedings of the Fourth Workshop on Financial Technology and Natural Language Processing (FinNLP)
Month:
December
Year:
2022
Address:
Abu Dhabi, United Arab Emirates (Hybrid)
Editors:
Chung-Chi Chen, Hen-Hsen Huang, Hiroya Takamura, Hsin-Hsi Chen
Venue:
FinNLP
Publisher:
Association for Computational Linguistics
Pages:
25–36
URL:
https://aclanthology.org/2022.finnlp-1.4
DOI:
10.18653/v1/2022.finnlp-1.4
Cite (ACL):
Donggyu Kim, Garam Lee, and Sungwoo Oh. 2022. Toward Privacy-preserving Text Embedding Similarity with Homomorphic Encryption. In Proceedings of the Fourth Workshop on Financial Technology and Natural Language Processing (FinNLP), pages 25–36, Abu Dhabi, United Arab Emirates (Hybrid). Association for Computational Linguistics.
Cite (Informal):
Toward Privacy-preserving Text Embedding Similarity with Homomorphic Encryption (Kim et al., FinNLP 2022)
PDF:
https://aclanthology.org/2022.finnlp-1.4.pdf
Video:
 https://aclanthology.org/2022.finnlp-1.4.mp4