An Investigation towards Differentially Private Sequence Tagging in a Federated Framework

Abhik Jana; Chris Biemann

doi:10.18653/v1/2021.privatenlp-1.4

Abstract

Building machine learning applications for sensitive domains such as medicine and law, where digitized text contains private information, requires anonymizing the text to preserve privacy. Sequence tagging, as done in Named Entity Recognition (NER), can help detect private information. However, training sequence tagging models requires a sufficient amount of labeled data, and in privacy-sensitive domains such labeled data cannot be shared directly either. In this paper, we investigate the applicability of a privacy-preserving framework for sequence tagging tasks, specifically NER. To this end, we analyze a framework for the NER task that incorporates two levels of privacy protection. First, we deploy a federated learning (FL) framework in which the labeled data are shared neither with the centralized server nor with peer clients. Second, we apply differential privacy (DP) while the models are being trained in each client instance. While both privacy measures are individually suitable for privacy-aware models, their combination results in unstable models. To our knowledge, this is the first study of its kind on privacy-aware sequence tagging models.
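The two levels of protection named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a one-layer linear model instead of an NER tagger, a DP-SGD-style local step (per-example gradient clipping plus Gaussian noise), and plain federated averaging on the server. All function names and the parameters `max_norm` and `noise_std` are hypothetical and chosen only for illustration.

```python
import math
import random


def clip(vec, max_norm):
    """Scale vec down so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(v * v for v in vec))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [v * scale for v in vec]


def dp_local_update(weights, data, lr, max_norm, noise_std, rng):
    """One noisy gradient step on a client's private (x, y) pairs.

    Per-example squared-error gradients are clipped, summed, perturbed
    with Gaussian noise, and averaged (a DP-SGD-style step, assumed here).
    """
    total = [0.0] * len(weights)
    for x, y in data:
        pred = sum(w * xi for w, xi in zip(weights, x))
        grad = clip([2.0 * (pred - y) * xi for xi in x], max_norm)
        total = [t + g for t, g in zip(total, grad)]
    noisy = [t + rng.gauss(0.0, noise_std * max_norm) for t in total]
    return [w - lr * n / len(data) for w, n in zip(weights, noisy)]


def federated_average(client_weights):
    """Server step: average the weight vectors sent by the clients."""
    n = len(client_weights)
    return [sum(ws) / n for ws in zip(*client_weights)]


def train(clients, dim, rounds, lr=0.1, max_norm=1.0, noise_std=0.0, seed=0):
    """Run federated rounds; raw client data never leaves dp_local_update."""
    rng = random.Random(seed)
    global_w = [0.0] * dim
    for _ in range(rounds):
        updates = [
            dp_local_update(global_w, data, lr, max_norm, noise_std, rng)
            for data in clients
        ]
        global_w = federated_average(updates)
    return global_w


if __name__ == "__main__":
    # Two clients, each holding one private example of y = 2 * x.
    clients = [[([1.0], 2.0)], [([1.0], 2.0)]]
    print(train(clients, dim=1, rounds=100, noise_std=0.0))
    print(train(clients, dim=1, rounds=100, noise_std=1.0))
```

In this sketch only weight vectors travel between clients and server, and `noise_std` controls how strongly each client's update is perturbed. Comparing runs with and without noise gives a rough feel for how the added noise can disturb training, though it says nothing about the specific instability the paper reports.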

Anthology ID: 2021.privatenlp-1.4
Volume: Proceedings of the Third Workshop on Privacy in Natural Language Processing
Month: June
Year: 2021
Address: Online
Editors: Oluwaseyi Feyisetan, Sepideh Ghanavati, Shervin Malmasi, Patricia Thaine
Venue: PrivateNLP
Publisher: Association for Computational Linguistics
Pages: 30–35
URL: https://aclanthology.org/2021.privatenlp-1.4
DOI: 10.18653/v1/2021.privatenlp-1.4
Cite (ACL): Abhik Jana and Chris Biemann. 2021. An Investigation towards Differentially Private Sequence Tagging in a Federated Framework. In Proceedings of the Third Workshop on Privacy in Natural Language Processing, pages 30–35, Online. Association for Computational Linguistics.
Cite (Informal): An Investigation towards Differentially Private Sequence Tagging in a Federated Framework (Jana & Biemann, PrivateNLP 2021)
PDF: https://aclanthology.org/2021.privatenlp-1.4.pdf
