Privacy-Aware Text Rewriting

Qiongkai Xu, Lizhen Qu, Chenchen Xu, Ran Cui


Abstract
Biased decisions made by automatic systems have led to growing concerns in research communities. Recent work from the NLP community focuses on building systems that make fair decisions based on text. Instead of relying on unknown decision systems or human decision-makers, we argue that a better way to protect data providers is to remove the traces of sensitive information before publishing the data. In light of this, we propose a new privacy-aware text rewriting task and explore two privacy-aware back-translation methods for the task, based on adversarial training and approximate fairness risk. Our extensive experiments on three real-world datasets with varying demographic attributes show that our methods are effective in obfuscating sensitive attributes. We also observe that the fairness risk method retains better semantics and fluency, while the adversarial training method tends to leak less sensitive information.
Anthology ID:
W19-8633
Volume:
Proceedings of the 12th International Conference on Natural Language Generation
Month:
October–November
Year:
2019
Address:
Tokyo, Japan
Venues:
INLG | WS
SIG:
SIGGEN
Publisher:
Association for Computational Linguistics
Pages:
247–257
URL:
https://aclanthology.org/W19-8633
DOI:
10.18653/v1/W19-8633
Cite (ACL):
Qiongkai Xu, Lizhen Qu, Chenchen Xu, and Ran Cui. 2019. Privacy-Aware Text Rewriting. In Proceedings of the 12th International Conference on Natural Language Generation, pages 247–257, Tokyo, Japan. Association for Computational Linguistics.
Cite (Informal):
Privacy-Aware Text Rewriting (Xu et al., 2019)
PDF:
https://aclanthology.org/W19-8633.pdf
Supplementary attachment:
 W19-8633.Supplementary_Attachment.pdf