Preventing Author Profiling through Zero-Shot Multilingual Back-Translation

David Adelani, Miaoran Zhang, Xiaoyu Shen, Ali Davody, Thomas Kleinbauer, Dietrich Klakow


Abstract
Documents as short as a single sentence may inadvertently reveal sensitive information about their authors, such as their gender or ethnicity. Style transfer is an effective way of transforming texts in order to remove any information that enables author profiling. However, for a number of current state-of-the-art approaches, the improved privacy is accompanied by an undesirable drop in the downstream utility of the transformed data. In this paper, we propose a simple, zero-shot way to effectively lower the risk of author profiling through multilingual back-translation using off-the-shelf translation models. We compare our models with five representative text style transfer models on three datasets across different domains. Results from both an automatic and a human evaluation show that our approach achieves the best overall performance while requiring no training data. We are able to lower the adversarial prediction of gender and race by up to 22% while retaining 95% of the original utility on downstream tasks.
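
The core idea is round-trip machine translation: a sentence is translated into one or more pivot languages and back into the source language with off-the-shelf models, which tends to strip author-specific surface cues while preserving meaning. Below is a minimal sketch of one such back-translation step, assuming the Hugging Face Transformers library and the Helsinki-NLP/opus-mt-en-de and Helsinki-NLP/opus-mt-de-en MarianMT checkpoints as an illustrative English–German pivot pair; this is not the authors' released pipeline (see the code link below).

# Minimal back-translation sketch (illustrative; not the paper's released code).
# Assumptions: Hugging Face Transformers with the MarianMT checkpoints named above.
from transformers import MarianMTModel, MarianTokenizer

def load_mt(model_name):
    # Load an off-the-shelf translation model and its tokenizer.
    tokenizer = MarianTokenizer.from_pretrained(model_name)
    model = MarianMTModel.from_pretrained(model_name)
    return tokenizer, model

def translate(texts, tokenizer, model):
    # Translate a batch of sentences using the model's default decoding settings.
    batch = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    outputs = model.generate(**batch)
    return [tokenizer.decode(o, skip_special_tokens=True) for o in outputs]

en_de = load_mt("Helsinki-NLP/opus-mt-en-de")
de_en = load_mt("Helsinki-NLP/opus-mt-de-en")

original = ["i'm sooo hyped about this place, the food was great!!"]
pivot = translate(original, *en_de)        # English -> German
obfuscated = translate(pivot, *de_en)      # German -> English round trip
print(obfuscated)

Chaining several pivot languages in this way (zero-shot, with no task-specific training) is the mechanism the paper evaluates against dedicated style transfer models.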
Anthology ID:
2021.emnlp-main.684
Volume:
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2021
Address:
Online and Punta Cana, Dominican Republic
Editors:
Marie-Francine Moens, Xuanjing Huang, Lucia Specia, Scott Wen-tau Yih
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
8687–8695
URL:
https://aclanthology.org/2021.emnlp-main.684
DOI:
10.18653/v1/2021.emnlp-main.684
Cite (ACL):
David Adelani, Miaoran Zhang, Xiaoyu Shen, Ali Davody, Thomas Kleinbauer, and Dietrich Klakow. 2021. Preventing Author Profiling through Zero-Shot Multilingual Back-Translation. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 8687–8695, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.
Cite (Informal):
Preventing Author Profiling through Zero-Shot Multilingual Back-Translation (Adelani et al., EMNLP 2021)
PDF:
https://aclanthology.org/2021.emnlp-main.684.pdf
Video:
https://aclanthology.org/2021.emnlp-main.684.mp4
Code:
uds-lsv/author-profiling-prevention-bt
Data:
CoLA