A Neighbourhood-Aware Differential Privacy Mechanism for Static Word Embeddings

We propose a Neighbourhood-Aware Differential Privacy (NADP) mechanism that considers the neighbourhood of a word in a pretrained static word embedding space to determine the minimal amount of noise required to guarantee a specified privacy level. We first construct a nearest neighbour graph over the words using their embeddings, and factorise it into a set of connected components (i.e. neighbourhoods). We then separately apply different levels of Gaussian noise to the words in each neighbourhood, determined by the set of words in that neighbourhood. Experiments show that our proposed NADP mechanism consistently outperforms previously proposed DP mechanisms, such as the Laplacian, Gaussian and Mahalanobis mechanisms, in multiple downstream tasks, while guaranteeing higher levels of privacy.


Introduction
NLP models are increasingly trained on private data such as medical conversations, social media posts and personal emails (Abdalla et al., 2020; Lyu et al., 2020b; Song and Shmatikov, 2019). However, we must ensure that sensitive information related to user privacy is not leaked during any stage of the model training process. To protect user privacy, Differential Privacy (DP) mechanisms add random noise to the training data (Feyisetan and Kasiviswanathan, 2021; Krishna et al., 2021; Feyisetan et al., 2020). However, it remains challenging to balance the trade-off between user privacy and performance in downstream NLP tasks.
We propose the Neighbourhood-Aware Differential Privacy (NADP) mechanism, which consists of three steps. First, given a set of words, we compute a nearest neighbour graph considering the similarity between the words (represented by the vertices of the graph), computed using their word embeddings. Second, we compute the connected components in the nearest neighbour graph to find the neighbourhoods of words. Third, we apply Gaussian noise to all words in each neighbourhood, such that the variance of the noise is determined by the words in that neighbourhood.

Figure 1: In a sparse neighbourhood, NADP adds a higher level of perturbation noise z to the target word embedding x in order to protect its privacy by disguising it among its neighbours, while in a dense neighbourhood it adds less noise.
As illustrated in Figure 1, if all words in a neighbourhood are highly similar to each other (i.e. a dense neighbourhood), less perturbation noise is required to anonymise a word, because even a small amount of noise can hide the corresponding word embedding among its neighbours. On the other hand, if the words in a neighbourhood are not very similar to each other (i.e. a sparse neighbourhood), we must add higher levels of perturbation noise to a word embedding because its nearest neighbour lies further away in the embedding space. Because the words in a language form a discrete set (unlike images, for example), not every point in the embedding space corresponds to a word. Therefore, if we do not add a sufficient amount of noise in a sparse neighbourhood, we run the risk of the target word being easily discovered via a simple nearest neighbour search. Instead of adding the same level of noise to all words in a vocabulary, as done in prior DP mechanisms, NADP minimises the total amount of noise by assigning low noise in dense neighbourhoods and high noise in sparse neighbourhoods. NADP has provable DP guarantees, as shown by our theoretical analysis. Moreover, NADP has the following desirable properties that make it attractive for NLP tasks.
(a) In NADP, noise vectors are sampled from the Gaussian distribution. Many static word embedding algorithms (Pennington et al., 2014; Arora et al., 2016; Mikolov et al., 2013) learn embeddings in the ℓ2 space. Moreover, the squared ℓ2 norm of a word embedding is known to positively correlate with the frequency of the word in the training corpus (Arora et al., 2016), while the joint co-occurrence probability of a set of words positively correlates with the squared ℓ2 norm of the sum of the corresponding word embeddings (Bollegala et al., 2018). Therefore, it is natural to use Gaussian noise, which corresponds to the ℓ2 embedding space used by many static word embedding learning methods, rather than the more widely-used Laplacian noise, which relates to the ℓ1 norm.
(b) Unlike previously proposed DP mechanisms for word embeddings (Feyisetan et al., 2020; Feyisetan and Kasiviswanathan, 2021; Krishna et al., 2021; Xu et al., 2020), NADP dynamically adjusts the level of noise added to a word embedding considering its neighbourhood. This enables us to optimally allocate a fixed noise budget over a vocabulary.
(c) NADP adds noise directly to the word embeddings and does not perform decoding after the noise addition step (Krishna et al., 2021). Decoding is a deterministic process and does not affect DP. Many NLP applications such as text classification and clustering require the input text to be represented in some vector space, and the noise-added input text representations can be used straightaway in such applications, without first decoding them back to text. In situations where users train word embeddings on their own private data and only send/release the embeddings to external machine learning services, we only need to anonymise the word embeddings (Feyisetan and Kasiviswanathan, 2021).

Results: Utility experiments (§ 5.1) conducted over four downstream NLP tasks show that NADP consistently outperforms the previously proposed Laplacian, Gaussian and Mahalanobis mechanisms. We conduct privacy experiments (§ 5.2) to evaluate the level of privacy guaranteed by a DP mechanism for word embeddings. Specifically, we estimate the probability of correctly predicting a word from its perturbed word embedding using the overlap between nearest neighbour sets. To evaluate the level of privacy protected for the entire set of word embeddings, we compute the skewness of the distribution of prediction probabilities. We find that NADP reports near-zero skewness values across a broad range of privacy levels, ϵ, which indicates significantly stronger privacy guarantees compared to other DP mechanisms. A source code implementation of NADP is publicly available.1

Related Work
Learning models from data with DP guarantees has been studied under private learning (Kasiviswanathan et al., 2008). Abadi et al. (2016) proposed a DP stochastic gradient descent method that adds Gaussian noise to the gradient of the loss function. Rogers et al. (2016) combined multiple DP algorithms using adaptive parameters. However, compared to continuous input spaces such as those in computer vision (Zhu et al., 2020), DP mechanisms for discrete inputs such as text remain understudied. Wang et al. (2021) proposed WordDP to achieve certified robustness against word substitution attacks in text classification. However, WordDP does not seek DP protection for the training data as we consider here, and instead uses DP randomness for certified robustness at inference time with respect to a test input. Krishna et al. (2021) proposed AdePT, an autoencoder-based approach to generate differentially private text transformations. However, Habernal (2021) showed that AdePT is not differentially private as claimed and proved weaker privacy guarantees. DPText (Alnasser et al., 2021; Beigi et al., 2019) uses an autoencoder to obtain a text representation and adds Laplacian noise to create private representations. However, Habernal (2022) proved that the use of the reparametrisation trick for the inverse continuous density function in DPText is inaccurate and that DPText violates its DP guarantees. Such prior attempts show the difficulty of developing theoretically correct DP mechanisms for NLP. Lyu et al. (2020a) proposed DP Neural Representation (DPNR) to preserve the privacy of text representations by first randomly masking words in the input texts and then adding Laplacian noise. However, unlike NADP, DPNR uses a neighbourhood-insensitive, fixed Laplacian noise distribution. Feyisetan et al. (2020) proposed a DP mechanism that first adds Laplacian noise to word embeddings and then returns the nearest neighbour of the noise-added embedding as the output. However, the ℓ2 norm of the noise vector scales almost linearly with the dimensionality of the embedding space. To address this issue, in their subsequent work (Feyisetan and Kasiviswanathan, 2021), they projected the word embeddings into a lower-dimensional space before adding Laplacian noise. Xu et al. (2020) proposed the Mahalanobis DP mechanism, which adds elliptical noise considering the covariance structure of the embedding space. Unlike the Gaussian or Laplacian mechanisms, the Mahalanobis mechanism adds heterogeneous noise along different directions, such that words in sparse regions of the embedding space have a sufficient likelihood of replacement without sacrificing the overall utility. They show the Mahalanobis mechanism to be superior to the Laplacian mechanism. The Mahalanobis mechanism is a special instance of metric (Lipschitz) DP, which originated in privacy-preserving geolocation studies (Andrés et al., 2013), where the Euclidean distance was used as the distance metric. Although metric DP considers the distance between two data points, it does not consider all of the nearest neighbours of each data point when deciding the level of noise to apply to a particular data point, unlike our NADP mechanism. Li et al. (2018) used adversarial learning to build NLP models, such as part-of-speech (PoS) taggers, that cannot predict the writer's age or sex while still accurately predicting PoS tags. Despite its empirical success, this approach does not have any formal DP guarantees. In contrast, our focus is on provably DP mechanisms with formal guarantees.
All of the prior work described thus far, except DPNR and AdePT, focuses on static word embeddings, as we do in this paper. A natural future extension of this work is DP mechanisms for contextualised embeddings. However, computational and practical properties of static word embeddings, such as being lightweight to both compute and store, are attractive for resource-limited (e.g. GPU and RAM) mobile devices. Considering that such personal mobile devices are used by billions of users and contain highly private data, DP mechanisms for static word embeddings remain an important research topic. Moreover, Gupta and Jaggi (2021) showed that it is possible to distil static word embeddings from pretrained language models with performance comparable to contextualised word embeddings.

DP for Word Embeddings
Let us denote the d-dimensional embedding of a word x in a vocabulary X by a vector x ∈ R^d. We can consider a word embedding algorithm as a function f : X → R^d that maps the words in a discrete vocabulary space X to a d-dimensional continuous space R^d. We can use a distance metric, Γ, defined in the embedding space, such as the Euclidean distance, to measure the distance Γ(x_i, x_j) between two words x_i and x_j. We can then find the set of top-m nearest neighbours, S_m(x), from X using Γ such that for any y ∈ S_m(x) and y′ ∉ S_m(x), Γ(x, y) ≤ Γ(x, y′) holds. The Jaccard similarity, Jaccard(x, y), between two words x and y is defined using their neighbourhoods as in (1):

Jaccard(x, y) = |S_m(x) ∩ S_m(y)| / |S_m(x) ∪ S_m(y)|.  (1)
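As a concrete illustration, the neighbour set S_m(x) and the Jaccard similarity in (1) can be sketched as follows. This is a toy example with made-up two-dimensional embeddings; the word list and vectors are ours, not from the paper.

```python
import numpy as np

def top_m_neighbours(word, emb, m):
    """S_m(word): the m nearest words to `word` under Euclidean distance."""
    x = emb[word]
    dists = {w: float(np.linalg.norm(x - v)) for w, v in emb.items() if w != word}
    return set(sorted(dists, key=dists.get)[:m])

def jaccard(w1, w2, emb, m):
    """Jaccard(w1, w2) = |S_m(w1) & S_m(w2)| / |S_m(w1) | S_m(w2)|, as in (1)."""
    s1, s2 = top_m_neighbours(w1, emb, m), top_m_neighbours(w2, emb, m)
    return len(s1 & s2) / len(s1 | s2)

emb = {
    "red":     np.array([1.0, 0.0]),
    "crimson": np.array([0.9, 0.1]),
    "colour":  np.array([0.5, 0.5]),
    "piano":   np.array([-1.0, -1.0]),
}
neighbours = top_m_neighbours("crimson", emb, m=2)
overlap = jaccard("red", "crimson", emb, m=2)
```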
We define two words x, y ∈ X to be in a symmetric neighbouring relation, x ≃ y, if the following two conditions are jointly satisfied: (a) x ∈ S_m(y) or y ∈ S_m(x), and (b) Jaccard(x, y) ≥ τ for a predefined similarity threshold τ ∈ [0, 1]. One could use a conjunction instead of a disjunction in condition (a) to enforce a mutual nearest neighbour relation. However, doing so results in a large number of small isolated neighbourhoods, because two words might not be mutual nearest neighbours unless they are synonyms (or highly related). Relaxing condition (a) to a disjunction forms neighbourhoods where one word might be a neighbour of another but not vice versa, as in hypernym-hyponym pairs. For example, colour could be a top nearest neighbour of crimson, but crimson might not be a top nearest neighbour of colour, because there are more prototypical colours, such as red, green and blue, than crimson.
Let us formally define DP for word embeddings. Because each word is assigned a vector by the word embedding learning algorithm, we can add noise to the embedding vectors to disguise a word among its nearest neighbours in the embedding space. However, in doing so we perturb the semantics encoded in the embeddings, thus potentially hurting downstream task performance. Therefore, there exists a trade-off between the amount of privacy that can be guaranteed by adding random noise to the embeddings and the performance of downstream NLP tasks that use those embeddings. A random mechanism operating on word embeddings is said to be DP if Definition 1 holds.
Definition 1 (Differential Privacy). A random mechanism M that takes a vector in the embedding space X and maps it into a space Y is (ϵ, δ)-DP if for every pair of neighbouring inputs x, x′ ∈ X and every measurable output set T ⊆ Y the relationship given by (2) holds:

P[M(x) ∈ T] ≤ exp(ϵ) P[M(x′) ∈ T] + δ.  (2)

Here, ϵ represents the level of privacy ensured by M, and smaller ϵ values result in stronger privacy guarantees. The global ℓ2 sensitivity of the embedding space is defined as ∆ = sup_{x≃x′} ||x − x′||_2. Given a set of word embeddings, ∆ can be estimated empirically by computing the maximum Euclidean distance between a word x and its most distant neighbour x′ in S_m(x). As an extreme case, consider the smallest possible neighbourhood size, corresponding to m = 2. Estimating ∆ in this case amounts to finding the maximum Euclidean distance between any pair of neighbouring words x, x′ ∈ X. Moreover, the ∆ estimated for m = 2 will be larger than the ∆ estimated for any other neighbourhood size m (> 2). Therefore, ∆ is independent of m and can be estimated via a deterministic process (i.e. measuring all pairwise Euclidean distances) from a given set of word embeddings.
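The empirical estimation of ∆ described above can be sketched as follows. This is a minimal sketch under our reading of the definition; the function name and toy vectors are ours.

```python
import numpy as np

def global_sensitivity(vectors, m=2):
    """Estimate Delta: the maximum Euclidean distance from any word to its
    m-th nearest neighbour (its most distant neighbour in S_m)."""
    delta = 0.0
    for i, x in enumerate(vectors):
        dists = sorted(float(np.linalg.norm(x - y))
                       for j, y in enumerate(vectors) if j != i)
        delta = max(delta, dists[m - 1])   # distance to the m-th neighbour
    return delta

vectors = [np.array([0.0, 0.0]), np.array([0.0, 1.0]), np.array([0.0, 3.0])]
delta = global_sensitivity(vectors, m=2)
```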

Gaussian Mechanism
The Gaussian mechanism uses the ℓ2 norm for estimating the sensitivity due to perturbation and is a more natural fit for word embeddings than, for example, the Laplace mechanism, which is associated with the ℓ1 norm. Therefore, we use the Gaussian mechanism as the basis for our proposal.
Let us consider a multivariate zero-mean isotropic Gaussian noise distribution, N(0, σ² I_d), where I_d is the identity matrix in the d-dimensional real space and σ is the standard deviation. For each word x ∈ X, we sample a random vector z ∼ N(0, σ² I_d) and create a noise-added embedding M_g(x) for x as given by (3):

M_g(x) = x + z.  (3)
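The mechanism in (3) amounts to a single sampling step. A hedged sketch, where σ would in practice be calibrated to the desired (ϵ, δ):

```python
import numpy as np

def gaussian_mechanism(x, sigma, rng=None):
    """M_g(x) = x + z with z ~ N(0, sigma^2 I_d), as in (3)."""
    rng = rng or np.random.default_rng()
    z = rng.normal(loc=0.0, scale=sigma, size=x.shape)
    return x + z

x = np.ones(300)                              # a toy 300-dimensional embedding
x_priv = gaussian_mechanism(x, sigma=0.5, rng=np.random.default_rng(0))
```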
Theorem 1 (Gaussian mechanism; Dwork and Roth (2014)). For any ϵ ∈ (0, 1) and δ ∈ (0, 1), the Gaussian mechanism M_g with σ ≥ √(2 ln(1.25/δ)) ∆/ϵ is (ϵ, δ)-DP.

The proof of Theorem 1 can be found in Appendix A of Dwork and Roth (2014).
Neighbourhood-Aware Differential Privacy (NADP)

NADP consists of three main steps. First, we create a nearest neighbour graph whose vertices represent the words, as described in § 4.1. Next, we factorise this nearest neighbour graph into a set of mutually exclusive neighbourhoods by finding its connected components, as described in § 4.2. Finally, for the words that belong to each connected component, we add random noise sampled from Gaussian distributions with zero mean and different standard deviations, determined according to the neighbourhood associated with the corresponding connected component. We prove that the proposed NADP mechanism is DP in § 4.3.

Nearest Neighbour Graph Construction
To represent the nearest neighbours of a set X of words, we construct a nearest neighbour graph, G, whose vertices correspond to the words in X. Given the one-to-one mapping between words and vertices, for notational simplicity we denote the i-th vertex of the graph by x_i (∈ X). Two vertices x_i and x_j are connected by an edge e_ij (∈ E) if x_i ≃ x_j holds between the corresponding words. As explained in § 3, we define two words x_i, x_j ∈ X to be in a symmetric neighbouring relation, x_i ≃ x_j, if the following two conditions are jointly satisfied: (a) x_i ∈ S_m(x_j) or x_j ∈ S_m(x_i), and (b) Jaccard(x_i, x_j) ≥ τ for a predefined similarity threshold τ ∈ [0, 1].
The pseudo code for constructing the nearest neighbour graph is shown in Algorithm 1. In our experiments we set m = 2, which considers only the top-2 neighbours (i.e. S_2), to ensure that only highly similar neighbours are connected by edges in the nearest neighbour graph. The threshold τ can be used to remove neighbours that have low similarity to a target word across the graph. For example, by setting τ = 0.8, we can ensure that no two words with a neighbourhood similarity (measured using the Jaccard coefficient) of less than 0.8 are connected by an edge in G. We empirically study the effect of varying τ on NADP later in our experiments.
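A self-contained sketch of the graph construction (our reading of Algorithm 1): an edge joins x_i and x_j when (a) one is in the other's top-m set and (b) the Jaccard similarity of their top-m sets is at least τ. The toy embeddings and function names are ours.

```python
import numpy as np
from itertools import combinations

def build_nn_graph(emb, m=2, tau=0.5):
    def s_m(w):                                   # top-m neighbour set S_m(w)
        x = emb[w]
        d = {v: float(np.linalg.norm(x - emb[v])) for v in emb if v != w}
        return set(sorted(d, key=d.get)[:m])

    nbrs = {w: s_m(w) for w in emb}
    edges = set()
    for wi, wj in combinations(emb, 2):
        in_top_m = wi in nbrs[wj] or wj in nbrs[wi]        # condition (a)
        jac = len(nbrs[wi] & nbrs[wj]) / len(nbrs[wi] | nbrs[wj])
        if in_top_m and jac >= tau:                        # condition (b)
            edges.add(frozenset((wi, wj)))
    return edges

emb = {
    "red":     np.array([1.0, 0.0]),
    "crimson": np.array([0.9, 0.1]),
    "colour":  np.array([0.5, 0.5]),
    "piano":   np.array([-1.0, -1.0]),
}
edges = build_nn_graph(emb, m=2, tau=0.3)
```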

Finding Connected Components
Once a nearest neighbour graph G is constructed for X, we next identify regions of neighbours, which we refer to as neighbourhoods. To obtain tightly connected neighbourhoods, we factorise G into a set of mutually exclusive connected components following the procedure described in Algorithm 2. We start by randomly selecting a word x from X and creating a neighbourhood X_1 consisting of all of x's neighbours. We then remove the words in X_1 from X, and repeat this process until all words in X are included in some neighbourhood. The procedure described in Algorithm 2 for obtaining connected components from G is simple, efficient and obtains good DP performance in our experiments. Moreover, it does not require the number of neighbourhoods, k, to be specified in advance, as would be the case for many clustering-based approaches to graph partitioning such as spectral clustering (von Luxburg, 2007). There is a possibility of obtaining long chains when computing connected components using Algorithm 2. However, we did not encounter this issue in our experiments, because the neighbouring relation defined in § 3 requires both the nearest neighbour condition and a high Jaccard similarity to be satisfied, which reduces the likelihood of forming long chains. Exploring alternative methods for factorising a given graph into a set of mutually exclusive connected components is deferred to future work.
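The component-finding step can be sketched as a breadth-first search that grows one neighbourhood at a time until every word is assigned; this is a minimal stand-in for Algorithm 2, with names of our choosing.

```python
from collections import deque

def connected_components(words, edges):
    """Grow one connected component at a time until every word is assigned
    to some neighbourhood (a sketch of Algorithm 2)."""
    adj = {w: set() for w in words}
    for e in edges:
        a, b = tuple(e)
        adj[a].add(b)
        adj[b].add(a)
    remaining, components = set(words), []
    while remaining:
        seed = remaining.pop()                 # pick an unassigned word
        comp, queue = {seed}, deque([seed])
        while queue:                           # breadth-first expansion
            for v in adj[queue.popleft()]:
                if v not in comp:
                    comp.add(v)
                    queue.append(v)
        remaining -= comp
        components.append(comp)
    return components

components = connected_components(["a", "b", "c", "d"], [frozenset(("a", "b"))])
```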

Perturbation of Word Embeddings
In this section, we first prove that NADP satisfies the DP conditions, and then present an algorithm that adds perturbation noise to the words in each neighbourhood. First, note that the trivial relation x ≃ x implies that the set {||x − y|| | x ≃ y} is nonempty, and hence we can consider the global ℓ2 sensitivity, ∆ = sup_{x≃y} ||x − y||, for any two neighbouring words x and y in the given set of words X. Balle and Wang (2018) proved Theorem 2, which shows that a set of word embeddings can be made differentially private by adding Gaussian noise sampled according to ∆, where Φ(t) is the Cumulative Density Function (CDF) of the standard univariate Gaussian distribution, given by (4):

Φ(t) = P[N(0, 1) ≤ t] = (1/√(2π)) ∫_{−∞}^{t} exp(−u²/2) du.  (4)
Theorem 2 (Balle and Wang (2018)). Let f : X → R^d be a function with global ℓ2 sensitivity ∆ > 0. For any ϵ ≥ 0 and δ ∈ [0, 1], the Gaussian output perturbation mechanism M(x) = f(x) + z with z ∼ N(0, σ² I_d) is (ϵ, δ)-DP if and only if the condition in (5) holds:

Φ(∆/(2σ) − ϵσ/∆) − exp(ϵ) Φ(−∆/(2σ) − ϵσ/∆) ≤ δ.  (5)

The original proof of Theorem 2 is provided in (Balle and Wang, 2018). However, in § C.1 we provide an alternative proof, which is more concise and can be directly extended to the case of multiple neighbourhoods represented by the connected components in the nearest neighbour graph.
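Calibrating σ from (ϵ, δ, ∆) under the analytic Gaussian condition of Theorem 2 can be sketched as a bisection on σ, since the left-hand side of the condition decreases monotonically in σ. This is our own sketch; the function names are not from the paper.

```python
import math

def phi(t):
    """Standard Gaussian CDF, computed via the error function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def dp_loss(sigma, delta_sens, eps):
    """Left-hand side of the condition in (5) for sensitivity delta_sens."""
    a = delta_sens / (2 * sigma) - eps * sigma / delta_sens
    b = -delta_sens / (2 * sigma) - eps * sigma / delta_sens
    return phi(a) - math.exp(eps) * phi(b)

def calibrate_sigma(delta_sens, eps, delta, lo=1e-6, hi=1e3):
    """Smallest sigma (up to bisection tolerance) satisfying (5)."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if dp_loss(mid, delta_sens, eps) > delta:
            lo = mid
        else:
            hi = mid
    return hi

sigma = calibrate_sigma(delta_sens=1.0, eps=1.0, delta=1e-5)
```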
Theorem 3 (see § C.2 for the proof) states that NADP satisfies the DP conditions. Theorem 3 (main). Let {X_1, …, X_k} be the connected components of the graph G(X, ≃), and let σ_i (1 ≤ i ≤ k) be non-negative real numbers such that σ_i > 0 whenever ∆_{i(x)} > 0 for any x ∈ X, where i(x) denotes the index of the connected component containing x and ∆_i the local sensitivity of X_i.
Remark 1. We have ∆_{i(x)} = 0 iff the connected component X_{i(x)} consists of only the single word x.
Theorem 3 guarantees that the NADP mechanism described in Algorithm 3 for perturbing a set of word embeddings satisfies DP. Specifically, we first compute u* globally for all neighbourhoods (Line 4) as the minimiser of g(u) (given by (7)) such that the DP condition in (5) is satisfied. We then determine the standard deviation, σ_i, corresponding to each neighbourhood using u* and the local sensitivity, ∆_i, computed from that neighbourhood. Finally, we sample noise vectors from N(0, σ_i² I_d) and add them to all word embeddings in each X_i.
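The per-neighbourhood perturbation step can be sketched as follows. This is a hedged reading of Algorithm 3: we assume σ_i scales the local sensitivity ∆_i (the neighbourhood diameter) by a globally calibrated multiplier u*, and that singleton components (∆_i = 0, cf. Remark 1) receive no noise. All names here are ours.

```python
import numpy as np
from itertools import combinations

def nadp_perturb(emb, components, u_star, rng=None):
    """Add N(0, sigma_i^2 I_d) noise per neighbourhood, with sigma_i
    proportional to the neighbourhood's local sensitivity Delta_i
    (an assumption of this sketch)."""
    rng = rng or np.random.default_rng()
    private = {}
    for comp in components:
        if len(comp) < 2:
            delta_i = 0.0                      # singleton: Delta_i = 0 (Remark 1)
        else:
            delta_i = max(float(np.linalg.norm(emb[a] - emb[b]))
                          for a, b in combinations(comp, 2))
        sigma_i = u_star * delta_i
        for w in comp:
            z = rng.normal(0.0, sigma_i, size=emb[w].shape) if sigma_i > 0 else 0.0
            private[w] = emb[w] + z
    return private

emb = {"a": np.zeros(3), "b": np.ones(3), "c": 5.0 * np.ones(3)}
private = nadp_perturb(emb, [{"a", "b"}, {"c"}], u_star=0.1,
                       rng=np.random.default_rng(1))
```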

Experiments
We use the pretrained3 300-dimensional GloVe embeddings (Pennington et al., 2014) for 2.8M words as the static word embeddings, which have also been used in much prior work (Xu et al., 2020; Feyisetan and Kasiviswanathan, 2021).
We build a nearest neighbour graph using the top-1000 frequent words in the English Wikipedia, which resulted in a 73,404-vertex graph. It takes less than 5 minutes to find all connected components of the graph containing the 73,404 words used in this paper. Moreover, this is a task-independent pre-processing step. Building the neighbourhood graph in a brute-force manner requires 3.5 hours, while approximate nearest neighbour methods such as SCANN (Guo et al., 2020) can be used to do the same in less than 1 minute with over 95% recall.
In our experiments, we compare NADP against the following DP mechanisms: the Gaussian mechanism described in § 3.1; the Laplacian mechanism, where noise vectors are sampled from the Laplace distribution with zero location parameter and different values of the scale parameter determined by ϵ; and the Mahalanobis mechanism with the parameter values recommended by Xu et al. (2020) (i.e. Mahalanobis norm λ = 1 and ϵ ∈ (0, 40]), which is the current SoTA DP mechanism for static word embeddings. All of the above DP mechanisms apply the same level of random noise to all word embeddings. Therefore, to understand the importance of assigning different levels of noise to different words, we consider a baseline DP mechanism, which we call the Jaccard mechanism. We define the density, η(x), of the neighbourhood, S_m(x), of a word x as the average Euclidean distance between x and its nearest neighbours. Next, we categorise words into two density categories, dense (X_1 = {x | x ∈ X, η(x) < η_0}) vs. sparse (X_2 = {x | x ∈ X, η(x) ≥ η_0}), based on a density threshold η_0. Our preliminary experiments showed that splitting into more than two categories did not result in significant performance gains. For a word x ∈ X_i, we sample a random vector n(x) ∼ N(0, σ_i² I_d), for i ∈ {1, 2}, and add it to x. Jaccard is a DP mechanism (see § C.3 for the proof). Note that the density threshold is used only by the Jaccard mechanism and is not required by NADP. It is determined automatically such that we get approximately equal numbers of words in the dense and sparse sets.
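The density computation and dense/sparse split of the Jaccard baseline can be sketched as follows, using the median density as the automatic threshold η_0 so the two sets are roughly balanced (our interpretation of "approximately equal numbers of words"); the toy embeddings are ours.

```python
import numpy as np

def densities(emb, m=2):
    """eta(x): average Euclidean distance from x to its top-m neighbours."""
    eta = {}
    for w, x in emb.items():
        d = sorted(float(np.linalg.norm(x - v)) for u, v in emb.items() if u != w)
        eta[w] = sum(d[:m]) / m
    return eta

def split_by_density(eta):
    """Split at the median density (our stand-in for eta_0)."""
    eta0 = float(np.median(list(eta.values())))
    dense = {w for w, e in eta.items() if e < eta0}
    sparse = {w for w, e in eta.items() if e >= eta0}
    return dense, sparse

emb = {"a": np.array([0.0, 0.0]), "b": np.array([0.1, 0.0]),
       "c": np.array([10.0, 0.0]), "d": np.array([20.0, 0.0])}
dense, sparse = split_by_density(densities(emb, m=1))
```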

Utility Experiments
To evaluate the semantic information preserved in word embeddings, we use the following standard tasks that have been used in much prior work for this purpose (Bollegala, 2022; Bollegala and O'Neill, 2022; Tsvetkov et al., 2015; Faruqui et al., 2015): word similarity measurement, semantic textual similarity (STS), text classification, and odd-man-out (Stanovsky and Hopkins, 2018). Due to space limitations, we detail the tasks, datasets and evaluation metrics in Appendix A.

Results: Figure 2 shows the performance obtained in the utility experiments with noise-added word embeddings for different values of the privacy parameter ϵ, where we use τ = 0.5. The total number of words used in the datasets for all utility experiments is n = 73,404. Therefore, we set δ = 1/73404 ≈ 0.000013623 in all experiments reported in the paper. We repeat each experiment five times and plot the mean and the standard error. Recall that smaller ϵ values provide stronger DP guarantees. From Figure 2, we see that NADP reports the best performance on all four tasks among the methods compared, across all ϵ values. Among the other methods, Mahalanobis performs second best to NADP in word-pair similarity prediction, text classification and odd-man-out, but performs worst in STS. In the word-pair similarity prediction, text classification and odd-man-out tasks, the performance of NADP as well as the other methods increases with ϵ, due to less noise being added to the word embeddings.
The performance in STS is comparatively less affected by ϵ because it is a sentence-level comparison task, which considers all perturbed word embeddings in a sentence, whereas the other three are word-level tasks. We see that the Jaccard and Gaussian mechanisms perform similarly in all tasks. This is not surprising given that the Jaccard mechanism draws its noise vectors from two independent Gaussian distributions. In particular, for high ϵ values, Gaussian outperforms Laplacian in the word-pair similarity prediction, text classification and odd-man-out tasks. This result implies that, for making word embeddings differentially private, the ℓ2 sensitivity considered in the Gaussian mechanism is more appropriate than the ℓ1 sensitivity considered in the Laplacian mechanism.

Privacy Experiments
To empirically measure the level of privacy protected by a DP mechanism, we consider p(x|M(x)), the probability of predicting the word x from its noise-added embedding M(x), as a metric of the privacy provided by a DP mechanism. However, it is difficult to accurately estimate probability densities in discrete spaces due to data sparseness. Therefore, we approximate p(x|M(x)) by |S_m(x) ∩ S_m(M(x))| / |S_m(x) ∪ S_m(M(x))|, using the nearest neighbour sets. It is noteworthy that this is a conservative estimate of p(x|M(x)) because, even if the nearest neighbours of x and M(x) fully overlap, there will still be a 1/m uncertainty, ensuring a nonzero level of privacy.
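This neighbour-overlap estimate can be sketched as follows (toy embeddings and names are ours; with no noise the estimate is 1, i.e. the word is fully predictable):

```python
import numpy as np

def top_m(vec, emb, m):
    """Top-m nearest words to an arbitrary vector `vec` in the embedding space."""
    d = {w: float(np.linalg.norm(vec - v)) for w, v in emb.items()}
    return set(sorted(d, key=d.get)[:m])

def privacy_estimate(word, perturbed, emb, m=2):
    """Approximate p(x | M(x)) by the Jaccard overlap of the neighbour
    sets of x and of its perturbed embedding M(x)."""
    s_x, s_mx = top_m(emb[word], emb, m), top_m(perturbed, emb, m)
    return len(s_x & s_mx) / len(s_x | s_mx)

emb = {"red": np.array([1.0, 0.0]), "crimson": np.array([0.9, 0.1]),
       "colour": np.array([0.5, 0.5]), "piano": np.array([-1.0, -1.0])}
```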
Due to differences in neighbourhood densities, some words are likely to be influenced more than others by a DP mechanism. From a DP point of view, we are interested in protecting the privacy of all words in the vocabulary, and not just a subset of it. Therefore, to empirically quantify the global effect of a DP mechanism on privacy, we compute the skewness of the distribution of the estimated p(x|M(x)) values. If most words x_i have low p_i = p(x_i|M(x_i)) values, the probability mass of the p_i distribution is shifted to the left of the mean, resulting in smaller skewness values (see Appendix B for further explanation). Therefore, smaller skewness values indicate that most words are protected (their probability of being discovered is smaller than the mean) under a DP mechanism.
Results: Figure 3 shows the skewness values reported by the Jaccard, Mahalanobis, Gaussian and Laplacian mechanisms and the proposed NADP mechanism (for different τ) for different ϵ values. Overall, we see that NADP reports the lowest skewness values among all DP mechanisms compared, indicating that it protects the privacy of word embeddings well. We see that the skewness values increase slightly with τ. Recall that when τ increases, the similarity of the neighbours connected to a target word by the symmetric neighbouring relation, ≃, increases in the nearest neighbour graph G. Therefore, when τ is high, unless we apply stronger random noise to the word embeddings, it becomes easier to discover the original word from its noise-added embedding. However, we note that the performance of NADP is relatively unaffected by different τ values, and skewness values are low for the τ = 0.1 setting, which we use in the utility experiments described in § 5.1. Although the Gaussian, Jaccard and Mahalanobis mechanisms obtain comparable skewness values when ϵ > 15, for ϵ < 5, where stronger privacy guarantees are required, NADP is the only DP mechanism with near-zero skewness values.

Investigating the Nearest Neighbours
To obtain qualitative insights into the levels of privacy provided by NADP, for a given word we compare its top-3 neighbours in the original embedding space (no noise added) against those obtained when the Mahalanobis and NADP mechanisms are used to add random noise. Table 1 shows the results for a randomly selected set of words. We see that for words such as police, hitler, wikileaks and fbi, even after applying the Mahalanobis mechanism (λ = 1), we still retrieve the original word as a nearest neighbour. This indicates that the Mahalanobis mechanism is unable to anonymise the target words in these cases. Although not reported here due to space limitations, this problem persists even in the Jaccard, Gaussian and Laplace mechanisms, which underperformed the Mahalanobis mechanism in the utility and privacy experiments. In the case of misogynist, the Mahalanobis mechanism retrieves highly similar neighbours such as sexist. On the other hand, the neighbours retrieved from the word embeddings anonymised using NADP are semantically less similar to the target word, and could thus be considered to better preserve the privacy of the target word.

Conclusion
We proposed NADP, which makes word embeddings indistinguishable from their nearest neighbours with theoretical DP guarantees. We compared NADP against existing DP mechanisms in multiple downstream utility experiments, which showed its superior performance. Moreover, we evaluated the level of privacy protection provided by NADP against that of other DP mechanisms, and found NADP to provide stronger privacy guarantees over a broad range of ϵ values. In future work, we plan to extend NADP to sentence/document embeddings and to evaluate it on languages other than English.

Ethical Considerations
We do not annotate or release any datasets as part of this research. However, the GloVe word embeddings that we use in our experiments are known to contain various types of unfair social biases, such as gender and racial biases (Zhao et al., 2018; Kaneko and Bollegala, 2019, 2021; Gonen and Goldberg, 2019). It is possible that these biases could be further amplified during the neighbourhood computation and noise-addition processes we perform in this work. Therefore, such social biases must be properly evaluated before the noise-added word embeddings produced by our proposed method are used in real-world, user-facing NLP applications.

Limitations
Our investigation in this paper was limited to GloVe embeddings, which is one of many available pretrained static word embeddings. There are other alternatives, such as Skip-Gram with Negative Sampling (SGNS) (Mikolov et al., 2013), PMI-based word embeddings (Arora et al., 2016) and fastText embeddings (Bojanowski et al., 2017), that could be used in place of GloVe. Moreover, contextualised word embeddings obtained using pretrained Masked Language Models (MLMs), such as BERT (Devlin et al., 2019), RoBERTa (Liu et al., 2019) and ALBERT (Lan et al., 2020), have reported superior performance in various downstream tasks, surpassing static word embeddings. Therefore, we consider it a natural next step to extend our proposed method to anonymise contextualised word embeddings. The theoretical tools developed in this paper should be helpful in proving DP conditions for contextualised word embeddings as well.
All of the downstream datasets and word embeddings we considered in this work are limited to the English language, which is known to be morphologically limited. Therefore, it is important to evaluate our proposed method on other languages using multilingual word embeddings to verify its effectiveness for languages other than English.
tains 1379 test sentence-pairs, and show the official score (i.e. the class-weighted geometric mean of the Spearman and Pearson correlations) in Figure 2b.
Text Classification: We train a binary classifier to predict the sentiment (positive vs. negative) of a short review text. Similar to the STS task, we represent a review using the centroid of the word embeddings of the words included in that review. We train a binary logistic regression model to predict the sentiment of a review and, in Figure 2c, report the averaged classification accuracy on the balanced test sets of three standard datasets: the movie reviews dataset (Pang and Lee, 2005), the customer reviews dataset (Hu and Liu, 2004) and the opinion polarity dataset (Wiebe et al., 2005).
Odd-man-out: Stanovsky and Hopkins (2018) proposed the odd-man-out task, where, given a set of five or more words, a system is required to choose the one that does not belong with the others. They annotated a dataset containing 843 sets via crowdsourcing. Pretrained word embeddings can be used to identify the odd man in a set by excluding one word at a time and measuring the average cosine similarity between all remaining pairs of words. The word whose exclusion results in the highest average pairwise similarity is chosen as the odd man. Unlike the previously described tasks, odd-man-out can be carried out in an unsupervised manner at the word level, and has higher inter-annotator agreement because it does not require numerical ratings. The percentage of correctly solved sets is shown in Figure 2d.
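The leave-one-out procedure just described can be sketched as below; the toy two-dimensional embeddings are our own illustrative stand-ins for pretrained vectors.

```python
import numpy as np

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def odd_man_out(words, emb):
    """Return the word whose exclusion maximises the average pairwise
    cosine similarity among the remaining words."""
    best_word, best_sim = None, -np.inf
    for w in words:
        rest = [emb[x] for x in words if x != w]
        sims = [cosine(rest[i], rest[j])
                for i in range(len(rest)) for j in range(i + 1, len(rest))]
        mean_sim = float(np.mean(sims))
        if mean_sim > best_sim:
            best_word, best_sim = w, mean_sim
    return best_word

# Toy embeddings: four mutually similar vectors and one outlier.
emb = {"cat": np.array([1.0, 0.1]), "dog": np.array([0.9, 0.2]),
       "fox": np.array([1.0, 0.0]), "cow": np.array([0.95, 0.15]),
       "car": np.array([0.0, 1.0])}
print(odd_man_out(list(emb), emb))  # -> car
```

Because no labels or ratings are needed, the procedure runs fully unsupervised on any embedding table, which is what makes the task convenient for comparing noise-added embeddings.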

B Skewness and Privacy
Skewness is a measure of the asymmetry of p(x|M(x)) about its mean, and can be positive, negative or zero depending on whether p(x|M(x)) has, respectively, a longer right tail, a longer left tail, or is perfectly symmetric around the mean (e.g. as in the case of the standard Normal distribution) (Joanes and Gill, 1998). Specifically, if we denote the probability of predicting the i-th word x_i by p_i = p(x_i|M(x_i)), the skewness of the distribution of p_i over n words is given by (1/n) Σ_{i=1}^{n} ((p_i − p̄)/s)^3, where p̄ and s are respectively the mean and standard deviation of {p_i}_{i=1}^{n}. We study the relationship between the level of privacy protected by the noise added using a particular DP mechanism M and the skewness of the distribution of p(x|M(x)) for the words x in a vocabulary X. For this purpose, we use the Gaussian mechanism described in § 3.1 in the paper, where we sample noise vectors z ∈ R^d from the d-dimensional spherical Gaussian N(0, σI_d) with zero mean and standard deviation σ, and add this noise to the word embedding x ∈ R^d representing the word x; specifically, M(x) = x + z. Next, we gradually increase σ ∈ [0, 1] in steps of 0.05 and compute the histograms of p(x|M(x)) values for the words in X. The histograms and their skewness values are shown in Figure 4.
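The skewness statistic used above can be computed directly from its definition. A minimal sketch, using synthetic samples (rather than actual p_i values from a DP mechanism) purely to illustrate the sign convention:

```python
import numpy as np

def skewness(p):
    """Sample skewness: (1/n) * sum(((p_i - mean) / std)**3)."""
    p = np.asarray(p, dtype=float)
    return float(np.mean(((p - p.mean()) / p.std()) ** 3))

rng = np.random.default_rng(0)
symmetric = rng.normal(size=100_000)          # symmetric -> skewness near 0
right_tailed = rng.exponential(size=100_000)  # long right tail -> positive skewness
```

A symmetric sample yields skewness near zero, while a sample with a long right tail (such as the exponential) yields a clearly positive value, matching the sign convention stated above.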
From Figure 4, we see that when no noise is added (i.e. σ = 0), the histogram peaks at 1, indicating that all words can be trivially discovered from their word embeddings, because the closest neighbour of any target word in the embedding space is itself. Because the distribution is symmetric around this peak, the skewness is zero. Overall, as we add increasingly strong noise, the histograms shift towards the left because fewer words can be perfectly recovered from the noise-added embeddings. Moreover, more probability mass is distributed to the right of the mode (peak), resulting in a longer right tail. Consequently, the skewness values also increase continuously with σ (except at σ = 0.05, where the distribution has split into two parts). This trend stems from the definition of skewness and is independent of the DP mechanism used to generate the noise. This result shows that when there are many words with smaller p(x|M(x)) values (i.e. the distribution has a longer left tail), the skewness values will be smaller, indicating that privacy is preserved for many words in X.
We will prove the following three statements. To prove (a), put g(u) = (1/√2π) h(u). Then h(u) is strictly decreasing for any u > 0, and therefore g(u) is strictly decreasing on (0, ∞). The latter half of the statement is clear from the definition of g(u). Next, to prove (b), observe that since g(u) is a continuous function on (0, ∞) satisfying lim_{u→+0} g(u) = 1 and lim_{u→∞} g(u) = 0, the equation g(u) = δ has a solution u′ for any δ ∈ (0, 1), which must be unique and satisfy u′ = u* because of the monotonicity of g(u). Finally, to prove (c), let σ = u*∆. It then follows that the mechanism M(x) = x + z with z ∼ N(0, σ²I_d) is (ε, δ)-DP, as stated in Theorem 2.
C.2 Proof of Theorem 3

Proof. For any i (1 ≤ i ≤ k), let ≃_i be the symmetric neighbouring relation obtained by restricting the relation ≃ to X_i. Then ∆_i equals the global L2 sensitivity of (X_i, ≃_i), and ∆_i(x) = ∆_i, σ_i(x) = σ_i for any x ∈ X_i. Hence, if ∆_i > 0, the mechanism M_i obtained by restricting M to (X_i, ≃_i) is (ε, δ)-DP if and only if (6) holds for any x ∈ X_i, by Theorem 2. Let x, x′ ∈ X be words such that x ≃ x′, and let i (= i(x) = i(x′)) be the index such that x, x′ ∈ X_i, that is, x ≃_i x′. Now, suppose ∆_i > 0 and (6) holds for any x ∈ X_i. Then the mechanism M_i is (ε, δ)-DP. Moreover, suppose ∆_i(x) > 0. Then, by definition, the mechanism M(x) = x + z with z ∼ N(0, σ²_{i(x)} I_d) is (ε, δ)-DP, as claimed in Theorem 3.
Theorem 4 (Jaccard mechanism is DP). The Jaccard mechanism with σ_i = ∆α_i √(2 log(1.25/δ)) / ε is (ε, δ)-DP. Here, α_i is a constant that depends only on the density category of a word, and ∆ is the global sensitivity over the vocabulary.
Proof. Note that under the Jaccard mechanism, noise vectors n(x) are sampled from one of the two Gaussians N(0, σ_1 I_d) or N(0, σ_2 I_d), depending on whether x ∈ X_1 or x ∈ X_2, respectively. Moreover, because α_i depends only on X_i, from σ_i = ∆α_i √(2 log(1.25/δ)) / ε and from Theorem 1, we see that each of these underlying Gaussian mechanisms is (ε, δ)-DP. Because X_1 ∩ X_2 = ∅ by definition, it follows from the compositionality property of DP that the overall Jaccard mechanism is also (ε, δ)-DP. This proof can be easily extended to more than two density categories by mathematical induction.
In our experiments, we use η_0 = 6.0 such that approximately equal numbers of words in X belong to each category, corresponding to α_1 = 1.835 and α_2 = 1.276 for m = 10. The global sensitivity ∆ is computed as the average Euclidean distance between a word and its furthest neighbour.
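The per-category noise calibration σ_i = ∆α_i √(2 log(1.25/δ)) / ε can be sketched as follows. This is an illustrative implementation, not the paper's code: the function name `jaccard_noise`, the 0/1 category indexing, and the default parameter values are our own assumptions (α values taken from the experimental setting above).

```python
import numpy as np

def jaccard_noise(x, category, eps=1.0, delta=1e-5,
                  Delta=1.0, alphas=(1.835, 1.276), rng=None):
    """Add Gaussian noise with scale sigma_i = Delta * alpha_i *
    sqrt(2 * log(1.25 / delta)) / eps, where alpha_i depends only on
    the word's density category (0 -> alpha_1, 1 -> alpha_2)."""
    rng = rng if rng is not None else np.random.default_rng()
    sigma = Delta * alphas[category] * np.sqrt(2.0 * np.log(1.25 / delta)) / eps
    return x + rng.normal(0.0, sigma, size=x.shape)

x = np.zeros(300)                    # a 300-dimensional word embedding
noisy_1 = jaccard_noise(x, 0)        # alpha_1 = 1.835 -> larger sigma
noisy_2 = jaccard_noise(x, 1)        # alpha_2 = 1.276 -> smaller sigma
```

Because α_1 > α_2, words in the first category receive proportionally more noise at the same (ε, δ) level, which is how the mechanism differentiates between density categories.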
The ability to guarantee the mean overlap between neighbourhoods before and after noise addition is important for NLP tasks that depend on neighbourhood information, such as semantic similarity measurement, bag-of-words-based information retrieval, and word/text classification. Unlike the Gaussian mechanism, the Jaccard mechanism gives us a direct relationship between the level of noise and the performance obtained using the anonymised embeddings in downstream tasks. Moreover, the Jaccard mechanism allows us to set different noise levels for sparse vs. dense regions in the embedding space, which is not possible with other DP mechanisms.

Figure 1: Anonymising a target word (shown in red) in a dense (left) vs. a sparse (right) neighbourhood of words (shown in blue). In the sparse neighbourhood, NADP adds a higher level of perturbation noise z to the target word embedding x in order to protect its privacy by disguising it among its neighbours, while in a dense neighbourhood it adds less noise.

Figure 2: Performance on utility experiments (§ 5.1), shown in sub-figures (a)-(d). Accuracy and correlation (with human ratings) not decreasing at high privacy (ε) levels (corresponding to stronger noise added by the DP mechanisms) is desirable. Performance obtained without adding any noise is shown by the horizontal dotted lines.

Figure 3: Skewness values for predicting words from their noise-added embeddings. Low skewness values are desirable, and indicate that the prediction probability distribution is similar to the Normal distribution and is not skewed towards a subset of the words.

Figure 4: Histogram of p_i values when zero-mean Gaussian noise with standard deviation σ is added to the word embeddings. Skewness values (skew) are shown in each histogram alongside σ.

Table 1: Top 3 neighbours of words without noise addition to the embeddings (no-noise), with the SoTA Mahalanobis mechanism, and with the proposed NADP mechanism. The Mahalanobis mechanism sometimes discloses the original word, whereas the NADP mechanism never does.