Timour Igamberdiev


2022

One size does not fit all: Investigating strategies for differentially-private learning across NLP tasks
Manuel Senge | Timour Igamberdiev | Ivan Habernal
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

Preserving privacy in contemporary NLP models allows us to work with sensitive data, but unfortunately comes at a price. We know that stricter privacy guarantees in differentially-private stochastic gradient descent (DP-SGD) generally degrade model performance. However, previous research on the efficiency of DP-SGD in NLP is inconclusive or even counter-intuitive. In this short paper, we provide an extensive analysis of different privacy preserving strategies on seven downstream datasets in five different ‘typical’ NLP tasks with varying complexity using modern neural models based on BERT and XtremeDistil architectures. We show that unlike standard non-private approaches to solving NLP tasks, where bigger is usually better, privacy-preserving strategies do not exhibit a winning pattern, and each task and privacy regime requires a special treatment to achieve adequate performance.

DP-Rewrite: Towards Reproducibility and Transparency in Differentially Private Text Rewriting
Timour Igamberdiev | Thomas Arnold | Ivan Habernal
Proceedings of the 29th International Conference on Computational Linguistics

Text rewriting with differential privacy (DP) provides concrete theoretical guarantees for protecting the privacy of individuals in textual documents. In practice, existing systems may lack the means to validate their privacy-preserving claims, leading to problems of transparency and reproducibility. We introduce DP-Rewrite, an open-source framework for differentially private text rewriting which aims to solve these problems by being modular, extensible, and highly customizable. Our system incorporates a variety of downstream datasets, models, pre-training procedures, and evaluation metrics to provide a flexible way to lead and validate private text rewriting research. To demonstrate our software in practice, we provide a set of experiments as a case study on the ADePT DP text rewriting system, detecting a privacy leak in its pre-training approach. Our system is publicly available, and we hope that it will help the community to make DP text rewriting research more accessible and transparent.

Privacy-Preserving Graph Convolutional Networks for Text Classification
Timour Igamberdiev | Ivan Habernal
Proceedings of the Thirteenth Language Resources and Evaluation Conference

Graph convolutional networks (GCNs) are a powerful architecture for representation learning on documents that naturally occur as graphs, e.g., citation or social networks. However, sensitive personal information, such as documents with people’s profiles or relationships as edges, is prone to privacy leaks, as the trained model might reveal the original input. Although differential privacy (DP) offers a well-founded privacy-preserving framework, GCNs pose theoretical and practical challenges due to their training specifics. We address these challenges by adapting differentially-private gradient-based training to GCNs and conduct experiments using two optimizers on five NLP datasets in two languages. We propose a simple yet efficient method based on random graph splits that not only improves the baseline privacy bounds by a factor of 2.7 while retaining competitive F1 scores, but also provides strong privacy guarantees of epsilon = 1.0. We show that, under certain modeling choices, privacy-preserving GCNs reach up to 90% of the performance of their non-private variants, while formally guaranteeing strong privacy measures.

2018

Metaphor Identification with Paragraph and Word Vectorization: An Attention-Based Neural Approach
Timour Igamberdiev | Hyopil Shin
Proceedings of the 32nd Pacific Asia Conference on Language, Information and Computation