How reparametrization trick broke differentially-private text representation learning

As privacy gains traction in the NLP community, researchers have started adopting various approaches to privacy-preserving methods. One of the favorite privacy frameworks, differential privacy (DP), is perhaps the most compelling thanks to its fundamental theoretical guarantees. Despite the apparent simplicity of the general concept of differential privacy, it seems non-trivial to get it right when applying it to NLP. In this short paper, we formally analyze several recent NLP papers proposing text representation learning using DPText (Beigi et al., 2019a,b; Alnasser et al., 2021; Beigi et al., 2021) and reveal their false claims of being differentially private. Furthermore, we also show a simple yet general empirical sanity check to determine whether a given implementation of a DP mechanism almost certainly violates the privacy loss guarantees. Our main goal is to raise awareness and help the community understand potential pitfalls of applying differential privacy to text representation learning.


Introduction
Differential privacy (DP), a formal mathematical treatment of privacy protection, is making its way to NLP (Senge et al., 2021). Unlike other approaches to protecting the privacy of individuals' text documents, such as redacting named entities (Lison et al., 2021) or learning text representations with a GAN attacker (Li et al., 2018), DP has the advantage of quantifying and guaranteeing how much privacy can be lost in the worst case. However, as Habernal (2021) showed, adapting DP mechanisms to NLP properly is a non-trivial task.
Privacy-preserving representation learning in an end-to-end fashion has recently been proposed in DPText (Beigi et al., 2019a,b; Alnasser et al., 2021). DPText consists of an autoencoder for text representation, a differential-privacy-based noise adder, and private attribute discriminators, among others. The latent text representation is claimed to be differentially private and thus shareable with data consumers for a given downstream task. Unlike approaches using a predetermined privacy budget ε, DPText takes ε as a learnable parameter and utilizes the reparametrization trick (Kingma and Welling, 2014) for random sampling. However, the downstream task results look too good to be true for such low ε values. We thus asked whether DPText is really differentially private.
This paper makes two important contributions to the community. First, we formally analyze the heart of DPText and prove that the reparametrization trick based on the inverse cumulative distribution function, as employed in DPText, is wrong, and that the model therefore violates the DP guarantees. This shows that extreme care must be taken when implementing DP algorithms in end-to-end differentiable deep neural networks. Second, we propose an empirical sanity check which simulates the actual privacy loss on a carefully crafted dataset with a reconstruction attack. This supports our theoretical analysis of the non-privacy of DPText and also confirms previous findings of broken privacy in another system, ADePT.


Differential privacy primer

Suppose we have a dataset (database) where each element belongs to an individual, for example Alice, Bob, Charlie, up to m individuals. Each person's entry, denoted with a generic variable x, could be an arbitrary object, but for simplicity consider it a real-valued vector x ∈ R^k. An important premise is that this vector contains some sensitive information we aim to protect, for example an income (x ∈ R), a binary value of whether or not the person has a certain disease (x ∈ {0.0, 1.0}), or a dense representation from SentenceBERT containing the person's latest medical record (x ∈ R^k). This dataset is held by someone we trust to protect the information, the trusted curator.

This dataset is a set from which we can create 2^m subsets, for instance X_1 = {Alice}, X_2 = {Alice, Bob}, etc. All these subsets form a universe 𝒳, that is X_1, X_2, ... ∈ 𝒳, and each of them is also called (a bit ambiguously) a dataset.

Definition 2.1. Any two datasets X, X' ∈ 𝒳 are called neighboring if they differ in one person.
For example, X = {Alice}, X' = {Bob} or X = {Alice, Bob}, X' = {Bob} are neighboring, while X = {Alice}, X' = {Alice, Bob, Charlie} are not.

Definition 2.2. A numeric query is any function f applied to a dataset X and outputting a real-valued vector, formally f : 𝒳 → R^k.
For example, numeric queries might return an average income (f → R), the number of persons in the database (f → R), or a textual summary of the medical records of all persons in the database represented as a dense vector (f → R^k). The query is simply something we want to learn from the dataset. A query might also be an identity function that just 'copies' the input, e.g., f(X = {(1, 0)}) → (1, 0) for a real-valued dataset X = {(1, 0)}.
An attacker who knows everything about Bob, Charlie, and the others would be able to reveal Alice's private information by querying the dataset and combining the answer with what they already know. A differentially private algorithm (or mechanism) M(X; f) thus randomly modifies the query output in order to minimize and quantify such attacks. Smith and Ullman (2021) formulate the principle of differential privacy as follows: "No matter what they know ahead of time, an attacker seeing the output of a differentially private algorithm would draw (almost) the same conclusions about Alice whether or not her data were used."

Let a DP-mechanism M(X; f) have an arbitrary range R (a generalization of our case of numeric queries, for which we would have R = R^k). Differential privacy is then defined as

    Pr(X | M(X; f) = z) / Pr(X' | M(X; f) = z) ≤ exp(ε) · Pr(X) / Pr(X')    (1)

for all neighboring datasets X, X' ∈ 𝒳 and all z ∈ R, where Pr(X) and Pr(X') are our prior knowledge of X and X'. In words, our posterior knowledge of X or X' after observing z can only grow by a factor of exp(ε) (Mironov, 2017), where ε is the privacy budget (Dwork and Roth, 2013).

Analysis of DPText
At the heart of the model, DPText relies on the standard Laplace mechanism, which takes a real-valued vector and perturbs each element by a random draw from the Laplace distribution.
Formally, let z be a real-valued d-dimensional vector. The Laplace mechanism then outputs a vector z̃ such that for each index i = 1, ..., d

    z̃_i = z_i + s_i,    (2)

where each s_i is drawn independently from a Laplace distribution with zero mean and scale b that is proportional to the ℓ1 sensitivity ∆ and the privacy budget ε, namely

    s_i ∼ Lap(0; b)  with  b = ∆/ε.    (3)

The Laplace mechanism satisfies differential privacy (Dwork and Roth, 2013).
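For concreteness, the following is a minimal numpy sketch of the vanilla Laplace mechanism (the function and variable names are ours, not DPText's):

```python
import numpy as np

def laplace_mechanism(z, sensitivity, epsilon, rng=None):
    """Perturb each element of the query output z with i.i.d. Laplace noise
    of scale b = sensitivity / epsilon, following Eqs. (2)-(3)."""
    rng = np.random.default_rng() if rng is None else rng
    b = sensitivity / epsilon
    return z + rng.laplace(loc=0.0, scale=b, size=np.shape(z))

# Example: privatize a 3-dimensional query output with epsilon = 1.0
z = np.array([0.2, -1.3, 4.0])
z_tilde = laplace_mechanism(z, sensitivity=1.0, epsilon=1.0)
```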

Reparametrization trick and inverse CDF sampling
DPText employs the variational autoencoder architecture in order to directly optimize the amount of noise added in the latent layer, parametrized by ε. In other words, the scale of the Laplace distribution becomes a trainable parameter of the network. As directly sampling from a distribution is known to be problematic for end-to-end differentiable deep networks, DPText borrows the reparametrization trick from Kingma and Welling (2014). In a nutshell, the reparametrization trick decouples drawing a random sample from a desired distribution (such as Exponential, Laplace, or Gaussian) into two steps: first draw a value from another distribution (such as Uniform), and then transform it through a deterministic function, typically the inverse cumulative distribution function (CDF).
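As an illustration of how this makes the noise scale trainable (our own sketch in PyTorch, not DPText's code, which is not public):

```python
import torch

def laplace_reparam(mu, b, shape):
    """Reparametrized Laplace draw: v ~ Uni(-0.5, +0.5) is parameter-free,
    so gradients can flow through the location mu and the scale b."""
    v = torch.rand(shape) - 0.5
    return mu - b * torch.sign(v) * torch.log1p(-2.0 * v.abs())

# The Laplace scale can now be optimized end-to-end:
b = torch.tensor(1.0, requires_grad=True)
noisy = laplace_reparam(torch.zeros(1000), b, (1000,))
noisy.pow(2).mean().backward()   # gradient w.r.t. b is well defined
```

The transform used here is the zero-centered formulation discussed in the next subsection; the crucial point is that the random draw itself does not depend on the learnable parameters.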
As a matter of fact, sampling using the inverse CDF is a well-known and widely used method (Devroye, 1986; Ross, 2012) and forms the backbone of probability distribution generators in many popular frameworks.

Inverse CDF of Laplace distribution
The inverse cumulative distribution function of the Laplace distribution Lap(µ; b) is

    F^{-1}(u) = µ − b · sgn(u − 0.5) · ln(1 − 2|u − 0.5|),    (4)

where u ∼ Uni(0, 1) is drawn from a standard uniform distribution (Sugiyama, 2016, p. 210; Nahmias and Olsen, 2015, p. 303). An equivalent expression without the sgn and absolute value functions is derived, e.g., by Li et al. (2019, p. 166) as

    F^{-1}(u) = µ + b · ln(2u)        for u < 0.5,    (5)
    F^{-1}(u) = µ − b · ln(2 − 2u)    for u ≥ 0.5.    (6)

An alternative sampling strategy, as shown, e.g., by Al-Shuhail and Al-Dossary (2020, p. 62), assumes that the random variable is drawn from a shifted, zero-centered uniform distribution v ∼ Uni(−0.5, +0.5) and transformed through the following function:

    F^{-1}(v) = µ − b · sgn(v) · ln(1 − 2|v|).    (7)

While both (4) and (7) generate samples from Lap(µ; b), note the substantial difference between u and v, since each is drawn from a different uniform distribution.
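Both formulations are easy to verify numerically; a small sketch (variable names are ours):

```python
import numpy as np

rng = np.random.default_rng(0)
mu, b, n = 0.0, 1.0, 1_000_000

# Eq. (4): u drawn from Uni(0, 1)
u = rng.uniform(0.0, 1.0, n)
x4 = mu - b * np.sign(u - 0.5) * np.log(1.0 - 2.0 * np.abs(u - 0.5))

# Eq. (7): v drawn from Uni(-0.5, +0.5)
v = rng.uniform(-0.5, 0.5, n)
x7 = mu - b * np.sign(v) * np.log(1.0 - 2.0 * np.abs(v))

# Both match a reference Laplace sample in location and scale
ref = rng.laplace(mu, b, n)
print(np.mean(x4), np.mean(x7), np.mean(ref))    # all approx. mu = 0
print(np.mean(np.abs(x4)), np.mean(np.abs(x7)),
      np.mean(np.abs(ref)))                      # all approx. b = 1
```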

Proofs of DPText violating DP
According to Eq. 3 in (Alnasser et al., 2021), Eq. 9 in (Beigi et al., 2019a), which is an extended version of (Beigi et al., 2019b), Eq. 14 in (Beigi et al., 2021), and personal communication confirming the formulas, the main claim of DPText is as follows (rephrased): DPText utilizes the Laplace mechanism, which is DP (Dwork and Roth, 2013). It implements the mechanism by sampling a value from the standard uniform distribution

    v ∼ Uni(0, 1)    (8)

and transforming it using

    F^{-1}(v) = µ − b · sgn(v) · ln(1 − 2|v|).    (9)

This claim is unfortunately false, as it mixes up the two approaches introduced in Sec. 3.2. As a consequence, the Laplace mechanism using such sampling is not DP, which we first prove formally.
Theorem 3.1. Sampling using the inverse CDF with (8) and (9), as done in DPText, does not produce samples from the Laplace distribution.
Proof. We will rely on the standard proof of sampling from the inverse CDF (see Appendix A). The essential step of that proof is that the CDF is increasing on the support of the uniform distribution, that is, on [0, 1]. However, F^{-1} as used in (9) is increasing only on the interval [0, 0.5]. For v ≥ 0.5, the argument of ln becomes non-positive, so that (9) is undefined at v = 0.5 and complex-valued for v > 0.5, with a real part that is, moreover, decreasing. Therefore (9) is not the inverse CDF of any probability distribution if used with Uni(0, 1).
As a consequence, the output of ln for a non-positive argument depends arbitrarily on the particular implementation; in numpy, it is NaN, accompanied only by a warning. Therefore this function samples only positive values or NaNs. Since the DPText sources are not publicly available, we can only assume that NaN values are either replaced by zero, or that sampling continues until the desired number of samples is reached (discarding NaNs). In either case, no negative values can be obtained. See Fig. 2 in the Appendix for various Laplace-based distributions sampled with different techniques, including the possible distributions sampled in DPText.
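The flaw is straightforward to reproduce; the following sketch is ours, since the DPText sources are unavailable:

```python
import numpy as np

rng = np.random.default_rng(0)
b, n = 1.0, 1_000_000

# DPText-style sampling: v from Uni(0, 1) plugged into the
# zero-centered transform (9)
v = rng.uniform(0.0, 1.0, n)
with np.errstate(invalid="ignore"):
    s = -b * np.sign(v) * np.log(1.0 - 2.0 * np.abs(v))

print(np.isnan(s).mean())    # approx. 0.5: every v > 0.5 yields NaN
print(np.nanmin(s) >= 0.0)   # True: the remaining noise is never negative
```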
Theorem 3.2. DPText with the private mechanism based on (8) and (9) fails to guarantee differential privacy.
Proof. We rely on the standard proof of the Laplace mechanism as shown, e.g., by Habernal (2021). Let X = 0 and X' = 1 be two neighboring datasets, and let the query f be the identity query, so that it simply outputs the value of X. Let the DPText mechanism M(X; f) output a particular value z.
In order to be differentially private, the mechanism M(X; f) has to fulfill the following bound on the privacy loss:

    Pr(M(X; f) = z) / Pr(M(X'; f) = z) ≤ exp(ε)    (10)

for all neighboring datasets X, X' ∈ 𝒳 and all outputs z ∈ R from the range of M, provided that our priors over X and X' are uniform (cf. Eq. 1). Fix z = 0.1. Then Pr(M(X) = 0.1) is positive (recall that the mechanism takes the query output f(X = 0) = 0 and adds a random number drawn from the sampled distribution, which is always positive as shown in Theorem 3.1). However, Pr(M(X') = 0.1) is zero, as the query output f(X' = 1) = 1 again only ever has a positive random number added to it and thus can never be less than 1. By plugging this into (10), we obtain

    Pr(M(X) = 0.1) / Pr(M(X') = 0.1) = Pr(M(X) = 0.1) / 0,

which results in an infinite privacy loss and violates differential privacy.

Empirical sanity check algorithm
It is impossible to empirically verify that a given DP-mechanism implementation is actually DP (Ding et al., 2018). However, it is possible to detect a DP-violating mechanism with a fair degree of certainty. We propose a general sanity check applicable to any real-valued DP mechanism, such as the Laplace mechanism, DPText, or any other.

We start by constructing two neighboring datasets X (Alice) and X' (Bob) such that X = (0, ..., 0) consists of n zeros and X' = (1, ..., 1) consists of n ones. The dimensionality n ∈ {1, 2, ...} is a hyperparameter of the experiment. We employ a synthetic data release mechanism (also called local DP). The mechanism takes X or X' and outputs its privatized version of the same dimensionality n, so that the zeros or ones become 'noisified' real numbers. The query sensitivity ∆ is n.

Thanks to the post-processing lemma, any post-processing of a DP output remains DP. We can thus turn the output real-valued vector back into all zeros or all ones, simply by rounding each element to the closer of 0 and 1 and applying majority voting. This process is in fact our reconstruction attack: given a privatized vector, we try to guess what the original values were, either all zeros or all ones.
What our attacker is doing, and what DP protects against, is the following: if Alice gives us her privatized data, we should not be able to tell (beyond the allowed factor) whether her private values were all zeros or all ones; the same holds for Bob.
By definition (1), and having no prior knowledge about X and X' apart from the fact that the values are correlated, our attacker cannot exceed the guaranteed privacy loss exp(ε):

    Pr(X | M(X; f) = z) / Pr(X' | M(X; f) = z) ≤ exp(ε).

We can estimate the conditional probability Pr(X | M(X; f) = z) using maximum likelihood estimation (MLE) simply as our attacker's precision: how many times the attacker reconstructed the true X values given the observed privatized vector. We do the same for estimating the conditional probability of X'. In particular, we repeatedly run each DP mechanism over X and X' 10 million times each, which gives very precise MLE estimates even for small ε.
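A condensed sketch of the sanity check (our own implementation outline, with a reduced number of runs; the function names and the exact loss estimator are ours):

```python
import numpy as np

def attack(z_priv):
    """Reconstruction attack: round each coordinate to the closer of 0/1,
    then majority-vote; returns 0 for 'all zeros' (X) or 1 for 'all ones' (X')."""
    return int(np.round(z_priv).clip(0, 1).mean() >= 0.5)

def empirical_privacy_loss(mechanism, n=8, runs=100_000):
    """Estimate the privacy loss of a local-DP release mechanism on the
    worst-case neighboring datasets X = (0,...,0) and X' = (1,...,1).
    By post-processing and the DP definition,
    Pr(attack(M(X)) = X) <= exp(eps) * Pr(attack(M(X')) = X) must hold."""
    X, X_prime = np.zeros(n), np.ones(n)
    p_guess_X_given_X = sum(attack(mechanism(X)) == 0 for _ in range(runs)) / runs
    p_guess_Xp_given_Xp = sum(attack(mechanism(X_prime)) == 1 for _ in range(runs)) / runs
    p_guess_X_given_Xp = 1.0 - p_guess_Xp_given_Xp
    return np.log(p_guess_X_given_X / (p_guess_X_given_Xp + 1e-12))

# Example: vanilla Laplace mechanism with sensitivity n and epsilon = 1.0
epsilon, n = 1.0, 8
rng = np.random.default_rng(0)
laplace = lambda x: x + rng.laplace(0.0, x.size / epsilon, x.size)
print(empirical_privacy_loss(laplace, n=n), "should stay below", epsilon)
```

A mechanism whose estimated loss clearly exceeds ε almost certainly violates DP; a loss below ε proves nothing, as the attack might simply be weak.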

Results and discussion
For the sake of completeness, we implemented two extreme baselines: one that simply copies the input (no privacy) and one that outputs completely random values regardless of the input (maximum privacy); these are shown in Figure 1 (left). The vanilla Laplace mechanism behaves as expected; all empirical losses for all dimensions (1 up to 128) are bounded by ε. We re-implemented the Laplace mechanism from ADePT (Krishna et al., 2021) which, due to a wrong sensitivity, has been shown theoretically to violate DP (Habernal, 2021). We empirically confirm that ADePT suffers from the curse of dimensionality, as the privacy loss explodes for larger dimensions. The last panel confirms our theoretical results on DPText, which (regardless of dimensionality) has infinite privacy loss. Note that we constructed the dataset carefully as two neighboring multidimensional correlated datasets that are as distant from each other as possible in the (0, 1)^n space. However, DP must guarantee privacy for any data points, even in the worst-case scenario, as demonstrated by the correct Laplace mechanism.

Figure 1: Area under the green line: our attack does not reveal more than allowed by the desired privacy budget; note that this does not guarantee DP, as the reconstruction attack might simply be weak. Area above the green line: the algorithm almost certainly violates DP, since our attack caused a bigger privacy loss than allowed by ε. The extreme baselines show the two extreme scenarios: a random output is absolutely private (but provides zero utility), while copying the input provides maximal utility but no privacy, revealing the data in full.

Conclusion
We formally proved that DPText (Beigi et al., 2019a,b; Alnasser et al., 2021; Beigi et al., 2021) is not differentially private due to wrong sampling in its reparametrization trick. We also proposed an empirical sanity check that confirmed our findings and can help reveal potential errors in DP mechanism implementations for NLP.

Ethics Statement
We declare no conflict of interest with the authors of DPText; we do not know them personally. The purpose of this paper is strictly scientific.