Adversarial Attack against Cross-lingual Knowledge Graph Alignment

Recent studies have shown that knowledge graph (KG) learning models are highly vulnerable to adversarial attacks. However, vulnerability analyses of cross-lingual entity alignment under adversarial attacks remain scarce. This paper proposes an adversarial attack model with two novel attack techniques that perturb the KG structure and degrade the quality of deep cross-lingual entity alignment. First, an entity density maximization method hides the attacked entities in dense regions of the two KGs, such that the derived perturbations are unnoticeable. Second, an attack signal amplification method mitigates gradient vanishing during the attack process, further improving attack effectiveness.


Introduction
Today, multilingual knowledge graphs (KGs), such as WordNet (Miller, 1992), DBpedia (Auer et al., 2007), YAGO (Hoffart et al., 2011), and ConceptNet (Speer et al., 2017), are becoming essential sources of knowledge for various AI applications, e.g., personal assistants, medical diagnosis, and online question answering. Cross-lingual entity alignment between multilingual KGs is a powerful tool that aligns the same entities in different monolingual KGs, automatically synchronizing language-specific KGs and transforming how these ubiquitous multilingual KGs can be understood and used (Sun et al., 2020a; Berrendorf et al., 2021a,b).
Unfortunately, real-world KGs are typically noisy for two main reasons: (1) massive fake information injected by malicious parties and users on online encyclopedia websites (e.g., Wikipedia (Wik) and Answers.com (Ans)), social networks (e.g., Twitter and Facebook), online communities (e.g., Reddit and Yahoo Answers), news websites, and search engines that usually serve as data sources for KGs; and (2) direct adversarial attacks on the KGs. For example, the Google Knowledge Graph has been criticized for providing answers without source attribution or citation, undermining people's ability to verify information and to develop well-informed opinions (Dewey, 2016).
Recent studies have shown that KG learning models remain highly sensitive to adversarial attacks, i.e., carefully designed small perturbations in KGs can cause the models to produce wrong predictions, in tasks including knowledge graph embedding (Minervini et al., 2017; Pujara et al., 2017; Pezeshkpour et al., 2019; Banerjee et al., 2021) and knowledge graph-based dialogue generation. However, existing techniques focus on adversarial attacks against single-KG learning tasks. They cannot be directly used to attack cross-lingual entity alignment models, which must analyze relations both within and across KGs. Two critical questions remain unanswered: (1) Can small perturbations on KGs defeat cross-lingual entity alignment models? (2) How can we design effective and unnoticeable perturbations against cross-lingual entity alignment?
The majority of cross-lingual entity alignment techniques train a model by minimizing the distance between pre-aligned entity pairs in the training data, such that the corresponding entity embeddings across KGs are close to each other; the entity pairs with the smallest distances in the test data are then output as alignment results (Mao et al., 2020a,b; Zhu et al., 2021; Mao et al., 2021).
In terms of the distribution of entities in a KG, one idea for perturbing an entity unobtrusively is to move it into a dense region containing many similar entities by adding/deleting relations to/from it, such that recognizing the modified entity among its many similar neighbors is non-trivial.
Existing gradient-based adversarial attack methods (Goodfellow et al., 2015; Madry et al., 2018) search for the weakest input features to attack by calculating the loss gradient. However, the vanishing gradient problem often arises when neural networks suffer from poor backward signal propagation, leading to attack failures (Athalye et al., 2018). Can we enhance attack signal propagation to improve attack effectiveness?
In this work, an entity density estimation and maximization method is employed to first estimate the distribution of entities in the KGs. Based on the estimated distributions, the entities to be attacked are then moved to dense regions of the two KGs by maximizing their densities. The attacked entities are thus hidden among many similar neighbors and become indistinguishable from them, which also makes it difficult to identify the correctly aligned entity pairs among many similar candidate entities.
We comprehensively study how poor signal propagation in neural networks leads to vanishing gradients in adversarial attacks on cross-lingual entity alignment. An attack signal amplification method is developed to secure informative attack signals with both a well-conditioned Jacobian and competent signal propagation from the alignment loss. This mitigates gradient vanishing during the attack process and further improves attack effectiveness.
Extensive experiments over real-world KG datasets validate the superior attack performance of the EAA model against several state-of-the-art cross-lingual entity alignment models. To the best of our knowledge, this work is the first to study adversarial attacks on cross-lingual entity alignment.

Problem Formulation
Given two input knowledge graphs G^1 and G^2, each is denoted as G^k = (E^k, R^k, T^k) (k ∈ {1, 2}), where E^k = {e^k_i : 1 ≤ i ≤ N_k} is the set of entities, R^k = {r^k_{ij} : 1 ≤ i, j ≤ N_k, i ≠ j} is the set of relations, and T^k ⊆ E^k × R^k × E^k is the set of triples. Each triple t^k_l = (e^k_i, r^k_{ij}, e^k_j) ∈ T^k in G^k denotes a head entity e^k_i connected to a tail entity e^k_j through relation r^k_{ij}. A^k is an N_k × N_k adjacency matrix that encodes the structure information of G^k. By using knowledge graph embedding (KGE), each triple can be represented as (e^k_i, r^k_{ij}, e^k_j), where the boldfaced e^k_i, r^k_{ij}, and e^k_j denote the embedding vectors of head e^k_i, relation r^k_{ij}, and tail e^k_j respectively. D contains a set of pre-aligned entity pairs, where e^1_i ↔ e^2_j indicates that the two entities e^1_i and e^2_j are equivalent across the language-specific KGs. Cross-lingual entity alignment aims to utilize D as training data to identify the one-to-one alignments between entities e^1_i and e^2_j in the two cross-lingual KGs G^1 and G^2 in the test data.
Most existing cross-lingual entity alignment models are supervised methods that minimize the distances (or maximize the similarities) between the embeddings of pre-aligned entity pairs e^1_i and e^2_j in D (Sun et al., 2020d). The entity pairs e^1_i and e^2_j in the test data with the largest similarities are selected as the alignment results. The following loss function is minimized to learn a KGE model h : e^k_i ∈ E^k → e^k_i, where h is often implemented as a graph convolutional network (GCN) for deep KGE:

L = − Σ_{(e^1_i, e^2_j) ∈ D} log σ((e^1_i)^T · e^2_j) − Σ_{(e^1_{i'}, e^2_{j'}) ∈ D'} log(1 − σ((e^1_{i'})^T · e^2_{j'})),   (1)

where (e^1_i, e^2_j) and (e^1_{i'}, e^2_{j'}) are positive and negative entity pairs respectively, and D' denotes the set of sampled negative pairs. (e^1_i)^T is the transpose of e^1_i, σ(·) is the sigmoid function, and the inner product (e^1_i)^T · e^2_j denotes the similarity between the two embedding vectors.
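To make the objective concrete, here is a minimal PyTorch sketch of such a sigmoid inner-product contrastive loss; the function name and the batched-tensor interface are illustrative assumptions, not the authors' implementation.

```python
import torch

def alignment_loss(e1_pos, e2_pos, e1_neg, e2_neg):
    """Sigmoid inner-product contrastive loss over pre-aligned
    (positive) and sampled (negative) entity pairs, as in Eq.(1).
    Each argument is a (batch, dim) tensor of entity embeddings."""
    pos = torch.sigmoid((e1_pos * e2_pos).sum(dim=-1))  # sigma((e1_i)^T . e2_j)
    neg = torch.sigmoid((e1_neg * e2_neg).sum(dim=-1))
    # Pull positive pairs together and push negative pairs apart.
    return -torch.log(pos + 1e-12).mean() - torch.log(1 - neg + 1e-12).mean()
```

In practice, the negative pairs are typically produced by corrupting one side of each positive pair during training.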
Given a trained deep KGE model e^k_i = h(e^k_i), an adversarial attacker aims to maximally degrade the alignment performance of h by injecting effective and unnoticeable relation perturbations (including relation addition and deletion) into the two clean KGs G^k (1 ≤ k ≤ 2), leading to two perturbed KGs Ĝ^k:

max_{Â^1, Â^2} L   s.t.   Σ_{k=1}^{2} ||Â^k − A^k||_0 ≤ ∆,   (2)

where A^k and Â^k are the clean and perturbed adjacency matrices respectively, and ∆ is the allowed attack budget, i.e., the allowed number of relation modifications.

Unnoticeable Adversarial Attacks
Existing GCN-based entity alignment methods often initialize entity features with random initialization or pre-trained word embeddings of entity names and utilize the adjacency matrices of the KGs to learn the entity embeddings. Thus, the embedding of an entity mainly depends on the embeddings of its neighbor entities. To modify the embedding of a target entity for the purpose of adversarial attacks, we need to remove some positive (i.e., existing) relations and add some negative (i.e., non-existing) relations between the target entity and its neighbors in the adjacency matrix, thereby degrading the accuracy of entity embedding and alignment. We use the i-th row of the adjacency matrix A^k (i.e., A^k_i) to represent the structure features of each entity e^k_i and analyze the impact of each structure feature (i.e., positive or negative relation) on the alignment accuracy.
As shown in Figure 1, assume that e^1_i and e^2_j are pre-aligned entity embeddings. If we hide the entity e^1_i in a dense region with many similar entities e^1_k by modifying its associated relations, then the surrounding e^1_k's make it difficult to differentiate e^1_i from them and to identify the correctly aligned pair (e^1_i, e^2_j) among the many similar candidates. In addition, if another pair of entity embeddings e^1_k and e^2_j are more similar than the pre-aligned embeddings e^1_i and e^2_j, i.e., (e^1_k)^T · e^2_j > (e^1_i)^T · e^2_j, then we obtain an incorrect alignment result (e^1_k, e^2_j). In this work, we leverage our proposed kernel density estimation (KDE) method (Zhang et al., 2020b) to estimate the distribution of the perturbed KGs and maximize the distance between pre-aligned entity pairs, degrading the performance of entity alignment while hiding the attacked entities in dense regions of the two KGs. Kernel density estimation essentially estimates a probability density function (PDF) f(x) of a random variable x to reveal the intrinsic distribution of x (Parzen, 1962). Let x^k be an N_k-dimensional random variable denoting the structure features of the entities in G^k. Its density is estimated as

f(x^k) = (1 / (N_k det(B))) Σ_{i=1}^{N_k} K(B^{-1}(x^k − A^k_i)),   (3)

where det(·) denotes the determinant operation and B is a diagonal bandwidth matrix with entries b_j > 0 to be estimated, which has a strong influence on the density estimate f(x^k); a good B should be as small as the data allow. K is a product symmetric kernel that satisfies ∫K(x)dx = 1 and ∫xK(x)dx = 0.
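As a concrete illustration of Eq.(3), the following sketch computes the estimate over binary structure features (adjacency rows) with numpy; the Gaussian choice of the product kernel K and the diagonal bandwidth vector b are assumptions made for illustration.

```python
import numpy as np

def kde_density(x, A, b):
    """Kernel density estimate f(x) with a product Gaussian kernel.
    A (n x d) stacks the adjacency rows A_i used as samples; b (d,)
    holds the per-dimension bandwidths, i.e., the diagonal of B."""
    n, d = A.shape
    z = (x[None, :] - A) / b[None, :]                       # B^{-1}(x - A_i)
    kernel = np.exp(-0.5 * (z ** 2).sum(axis=1)) / (2 * np.pi) ** (d / 2)
    return kernel.mean() / np.prod(b)                       # 1/(n det(B)) sum K(.)
```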
The vector form f(x^k) can be rewritten in element form, where x^k_j denotes the j-th dimension of x^k and b_j the corresponding bandwidth. We then calculate the derivative ∂f(x^k)/∂b_j and use a greedy search method to determine the bandwidths in the kernel density estimation. For a non-trivial (trivial) dimension j, updating the bandwidth b_j has a strong (weak) influence on f(x^k). We greedily reduce b_j along the sequence b_0, b_0 s, b_0 s^2, ... for a parameter 0 < s < 1, until b_j falls below a certain threshold τ_j, to check whether a small update in b_j leads to a large update in f(x^k).
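A minimal sketch of this greedy shrinkage is shown below, reusing kde_density from the sketch above; the finite-difference proxy for ∂f(x^k)/∂b_j and the default values of b_0, s, and the thresholds are assumptions.

```python
import numpy as np

def greedy_bandwidths(x, A, b0=1.0, s=0.5, tau=1e-3, influence=1e-2):
    """Greedily shrink each bandwidth b_j along b0, b0*s, b0*s^2, ...
    Stop for dimension j once b_j would fall below the threshold tau,
    or once a small update in b_j causes a large change in f(x) (a
    finite-difference proxy for |df/db_j|), marking j as non-trivial."""
    b = np.full(A.shape[1], float(b0))
    for j in range(A.shape[1]):
        while b[j] * s > tau:
            trial = b.copy()
            trial[j] = b[j] * s
            df = abs(kde_density(x, A, trial) - kde_density(x, A, b))
            if df / (b[j] - trial[j]) > influence:   # strong influence on f
                break                                # keep the current b_j
            b[j] = trial[j]
    return b
```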
We use an initial bandwidth b_0 for every dimension, stop shrinking b_j once the derivative ∂f(x^k)/∂b_j is larger than a certain threshold, and derive the corresponding variance of the estimator. Given the bandwidth B estimated by Algorithm 1, we can calculate the density f(x^k) of x^k in Eq.(3). The perturbation process maximizes the following attack loss L_A, defined in terms of the estimates f(x^1) and f(x^2) in the two KGs G^1 and G^2, to produce unnoticeable perturbations:

L_A = Σ_{(e^1_i, e^2_j) ∈ D} [ f(Â^1_i) + f(Â^2_j) − σ((ê^1_i)^T · ê^2_j) ],   (4)
where Â^1_i = A^1_i + δ^1_i and Â^2_j = A^2_j + δ^2_j denote perturbations of the clean structure features A^1_i and A^2_j in G^1 and G^2 obtained by adding a small amount of relation perturbations δ^1_i and δ^2_j, such that the perturbed embedding ê^1_i is far away from ê^2_j and thus the alignment accuracy is decreased. In addition, we push e^1_i and e^2_j into dense regions to generate ê^1_i and ê^2_j by maximizing f(Â^1_i) and f(Â^2_j), such that ê^1_i and ê^2_j are indistinguishable from their neighbors in the perturbed KGs. This reduces the possibility of perturbation detection by humans or defender programs.
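Putting the pieces together, the per-pair attack objective sketched below mirrors the reconstruction in Eq.(4), reusing kde_density defined earlier; the trade-off weight beta and the exact combination of the density and similarity terms are assumptions rather than the paper's exact formula.

```python
import numpy as np

def attack_loss_pair(e1_hat, e2_hat, A1_hat_row, A2_hat_row, A1, A2, b1, b2, beta=1.0):
    """Per-pair attack loss to be maximized: raise the KDE densities of
    both perturbed structure rows (hiding the entities in dense regions)
    while lowering the sigmoid inner-product similarity of the perturbed
    embeddings. beta is an assumed trade-off weight."""
    sim = 1.0 / (1.0 + np.exp(-float(e1_hat @ e2_hat)))    # sigma((e1)^T . e2)
    density = kde_density(A1_hat_row, A1, b1) + kde_density(A2_hat_row, A2, b2)
    return beta * density - sim
```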
The perturbations are generated iteratively with projected gradient descent (PGD):

(A^1_i)^{(t+1)} = Π_S[(A^1_i)^{(t)} + (1 − 2(A^1_i)^{(t)}) ⊙ ReLU(sign(∇_{(A^1_i)^{(t)}} L_A))],   (5)

and analogously for (A^2_j)^{(t+1)}, where (A^1_i)^{(t+1)} and (A^2_j)^{(t+1)} denote the perturbations of A^1_i and A^2_j derived at step t, and ε specifies the budget of allowed perturbed relations for each attacked entity. S represents the constraint set of the projection operator Π, i.e., it encodes whether a relation in A^1_i is modified or not and enforces the per-entity budget ε. The composition of the ReLU and sign operators guarantees (A^1_i)^{(t+1)} ∈ {0, 1}^{N_1} and (A^2_j)^{(t+1)} ∈ {0, 1}^{N_2}, as it adds (or removes) a relation or keeps it unchanged depending on whether the corresponding derivative in the gradient is positive (or negative). The outputs (A^1_i)^{(T)} and (A^2_j)^{(T)} at the final step T are used as the perturbed adjacency rows Â^1_i and Â^2_j.
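One plausible realization of a single projected step is sketched below: ReLU(sign(·)) acts as a flip mask on a binary adjacency row, and the projection keeps at most ε modifications; the tie-breaking by gradient magnitude is an assumption.

```python
import numpy as np

def pgd_relation_step(A_row, grad, epsilon):
    """One projected ascent step on a binary adjacency row A_row.
    ReLU(sign(grad)) in {0,1} marks entries whose derivative is
    positive, i.e., relations worth flipping (add if absent, remove
    if present); negative derivatives leave entries unchanged. The
    projection keeps at most epsilon modifications by retaining the
    flips with the largest gradients (an assumed tie-breaking rule)."""
    flip = np.maximum(np.sign(grad), 0.0)            # ReLU(sign(grad)) in {0,1}
    chosen = np.flatnonzero(flip)
    if chosen.size > epsilon:
        keep = chosen[np.argsort(-grad[chosen])[:epsilon]]
        flip = np.zeros_like(flip)
        flip[keep] = 1.0
    return np.abs(A_row - flip)                      # XOR-style toggle stays in {0,1}
```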

Effective Adversarial Attacks
Unfortunately, the above PGD-based unnoticeable attack method needs to iteratively calculate the gradient ∇_{A^1_i} L_A, which by the chain rule mainly depends on the attack signal φ((ê^1_i)^T, ê^2_j) backpropagated from the alignment loss and on the input-output Jacobian J_i = ∂ê^1_i/∂Â^1_i of the network. The gradient is thus determined by the signal and the Jacobian together: if either the signal has a saturating gradient or the Jacobian is insignificant, the gradients vanish and the attack fails.
The property that all singular values of a neural network's input-output Jacobian matrix concentrate near 1 is known as dynamical isometry (Pennington et al., 2017). Ensuring that the mean squared singular value of a network's input-output Jacobian is O(1) is essential for avoiding the exponential vanishing or explosion of gradients. We leverage the dynamical isometry theory to improve the effectiveness of the PGD adversarial attacks. Concretely, a neural network satisfies dynamical isometry if all singular values λ_{ir} of the Jacobian J_i are close to 1, i.e., |1 − λ_{ir}| ≤ ξ for ∀r ∈ {1, ..., min{N_1, N_2}} and a small positive number ξ ≈ 0. In our problem, when the Jacobian matrix J_i satisfies dynamical isometry, the signal φ((e^1_i)^T, e^2_j) backpropagates isometrically through the network, preserving norms and all angles between vectors.
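For intuition, the short diagnostic below estimates how close a differentiable model's input-output Jacobian is to dynamical isometry at a given input; it is a checking utility, not part of the attack algorithm itself.

```python
import torch

def dynamical_isometry_check(model, x, xi=0.05):
    """Return the singular values of the input-output Jacobian of
    `model` at input x, and whether they all lie within xi of 1
    (i.e., whether the network is close to dynamical isometry)."""
    J = torch.autograd.functional.jacobian(model, x)
    J = J.reshape(-1, x.numel())                 # flatten to a 2-D matrix
    svals = torch.linalg.svdvals(J)
    return svals, bool(((svals - 1.0).abs() <= xi).all())
```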
Intuitively, if we select a good attack signal amplification factor α to amplify e^1_i and e^2_j as

ẽ^1_i = α e^1_i,   ẽ^2_j = α e^2_j,   (6)

then this can improve the diffusion of attack signals. In addition, a good α should keep the relative order of the network's output logits invariant, to ensure that the decision boundary of entity alignment is unchanged. We then rewrite the gradients in terms of α.
When α → ∞, the sigmoid saturates, so the signal φ((ẽ^1_i)^T, ẽ^2_j) approaches zero and the vanishing gradient problem is encountered in adversarial attacks. Conversely, if α = 0, all singular values of αJ_i are equal to zero and the amplified embedding ẽ^1_i is equal to zero, which leads to the vanishing gradient problem too.
Therefore, a desired α for avoiding the exponential vanishing of gradients should lie strictly between 0 and ∞, in order to keep the signal φ((ẽ^1_i)^T, ẽ^2_j) large enough, i.e., ||φ((ẽ^1_i)^T, ẽ^2_j)||_2 > η for a positive threshold η, as well as to make all singular values of αJ_i close to 1, such that the signal φ((ẽ^1_i)^T, ẽ^2_j) can be well backpropagated from the output layer to the input layer.
In order to make the mean of the singular values of αJ_i close to 1, the first option α_1 is the inverse of the mean of the singular values of J_i:

α_1 = |D| N / Σ_{i=1}^{|D|} Σ_{r=1}^{N} λ_{ir},   (7)

where λ_{ir} is the r-th singular value of J_i, |D| is the size of the set D of pre-aligned entity pairs, and N = min{N_1, N_2}. For the purpose of ensuring ||φ((ẽ^1_i)^T, ẽ^2_j)||_2 > η, the second option α_2 should satisfy 1 − σ((ẽ^1_i)^T · ẽ^2_j) > η/||ẽ^2_j||_2. A feasible α_2 can be obtained through the following theorem.

Theorem 1. Suppose (e^1_i)^T · e^2_m ≤ c and ||ẽ^2_m||_2 ≥ d for ∀e^2_m ∈ E^2. For a given 0 < η < d/2, if α_2 < (1/c) log((d − η)/η), then 1 − σ((ẽ^1_i)^T · ẽ^2_j) > η/||ẽ^2_j||_2 for ∀e^2_j ∈ E^2 (the proof is given in the appendix).
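A small sketch of the two amplification factors follows; taking the minimum of the two is one plausible way to integrate them, while the paper's Algorithm 2 (described next) may combine them differently.

```python
import math

def amplification_factor(singular_values, c, d, eta):
    """Combine the two options for alpha. alpha_1 is the inverse of
    the mean singular value of the Jacobians, pushing the mean
    singular value of alpha * J_i toward 1 (Eq.(7)). alpha_2 is the
    Theorem 1 bound (1/c) * log((d - eta)/eta), which keeps the attack
    signal norm above eta; it assumes 0 < eta < d/2, c >= the largest
    inner product, and d <= the smallest amplified embedding norm."""
    alpha_1 = len(singular_values) / sum(singular_values)
    alpha_2 = math.log((d - eta) / eta) / c
    # Assumed integration rule: the largest alpha satisfying both goals.
    return min(alpha_1, alpha_2)
```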
Algorithm 2 combines the above two kinds of α to produce effective adversarial attacks with attack signal amplification. The perturbed entity embeddings ê^1_i and ê^2_j are initialized with the clean ones e^1_i and e^2_j in step 2. The first amplification factor α_1 is calculated in step 3. The second factor α_2 is computed in steps 5-7. α_1 and α_2 are integrated in steps 8-9 to enhance the attack signal propagation of the neural network. The PGD attack method with attack signal amplification is then utilized to perturb the KGs. The algorithm repeats this iterative procedure until convergence.

We compare the EAA model with seven state-of-the-art attack models. Sememe-based Word Substitution (SWS) incorporates sememe-based word substitution and swarm optimization-based search to conduct word-level attacks (Zang et al., 2020). Inflection Word Swap (IWS) perturbs the inflectional morphology of words to craft plausible and semantically similar adversarial examples (Morris et al., 2020). We utilize these two word-level attack models to replace the associated entities of a relation based on semantics. GF-Attack attacks graph embedding methods by devising a new loss and approximating the spectrum (Chang et al., 2020). LowBlow is a general low-rank adversarial attack model that can affect the performance of various graph learning tasks (Entezari et al., 2020). We use these two graph attack models to directly add/remove relations in terms of graph topology. CRIAGE adds/removes facts to/from the KG to degrade the performance of link prediction (Pezeshkpour et al., 2019). DPA contains a collection of data poisoning attack strategies against knowledge graph embedding. RL-RR uses a reinforcement learning policy to produce deceptively perturbed KGs while preserving the downstream quality of the original KG (Raman et al., 2021).

Experimental Evaluation
We evaluate four versions of EAA to show the strengths of different components. EAA-P uses the basic PGD (Madry et al., 2018) to produce adversarial attacks. EAA-D only utilizes the KDE and density maximization to generate effective and unnoticeable attacks. EAA-A employs only our attack signal amplification strategy to improve the performance of the basic PGD attack. EAA operates with the full support of both KDE and signal amplification components.
We validate the effectiveness of the above attack models with three representative cross-lingual entity alignment algorithms. AttrGNN integrates both attribute and relation triples for better performance of cross-lingual entity alignment. RNM is a novel relation-aware neighborhood matching model for entity alignment (Zhu et al., 2021). To the best of our knowledge, REA is the only robust cross-lingual entity alignment solution against adversarial attacks, which detects noise in the perturbed inter-KG entity links.
We use two popular entity alignment metrics to verify attack effectiveness: Hits@k (i.e., the ratio of correctly aligned entities ranked in the top k candidates) and MRR (i.e., mean reciprocal rank). A smaller Hits@k or MRR indicates worse entity alignment and thus a better attack. k is fixed to 1 in all tests.
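Both metrics can be computed directly from the rank of each test entity's true counterpart, as in the sketch below (ranks are assumed to be 1-based).

```python
import numpy as np

def hits_at_k_and_mrr(ranks, k=1):
    """ranks[i] is the 1-based rank of the correct counterpart of the
    i-th test entity among all candidate entities. Smaller Hits@k and
    MRR after an attack mean worse alignment, i.e., a better attack."""
    ranks = np.asarray(ranks, dtype=float)
    hits_k = float((ranks <= k).mean())
    mrr = float((1.0 / ranks).mean())
    return hits_k, mrr
```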
Attack performance on various datasets with different entity alignment algorithms. Tables 2-4 exhibit the Hits@1 and MRR scores of the three GCN-based entity alignment algorithms on test data under nine attack models over three groups of cross-lingual datasets. Clean denotes experiments run on the original KGs without any perturbations. For all other attack models, the ratio of perturbed relations is fixed to 5% in these experiments. Among the nine attack methods, the EAA method achieves the lowest Hits@1 and MRR scores on the perturbed KGs in most experiments.

Figures 2 and 3 present the Hits@1 and MRR scores achieved by the three entity alignment methods under adversarial attacks with the four variants of our EAA attack model. The complete EAA achieves the lowest Hits@1 (< 0.681) and the smallest MRR scores (< 0.709), clearly better than the other versions. Notice that EAA-A achieves better attack performance than EAA-P in most tests. A reasonable explanation is that our attack signal amplification technique alleviates the vanishing gradient issue, which effectively helps maintain the utility of adversarial attacks on GCN-based entity alignment models. In addition, EAA-D also performs well in most experiments compared with EAA-P. A rational guess is that it is difficult to correctly match entities across two KGs when they lie in dense regions with many similar entities. These results illustrate that both the KDE and signal amplification methods are important for producing effective and unnoticeable attacks on entity alignment.
Attack performance with varying perturbed relations. Figure 4 presents the performance of entity alignment under the nine attack models when varying the ratio of perturbed relations from 5% to 30%. The attack performance of every attacker improves as the number of perturbed relations increases. This phenomenon indicates that current GCN-based entity alignment methods are very sensitive to adversarial attacks. EAA achieves the best attack performance among all models; especially when the perturbation ratio is larger than 10%, the Hits@1 values drop quickly.

Impact of perturbation budget ε. Figure 5 (a) measures the effect of the per-entity budget ε in the EAA model on entity alignment by varying ε from 1 to 6. When ε increases, both the Hits@1 and MRR scores under the EAA attack decrease substantially. This demonstrates that it is difficult to train a robust entity alignment model under a large budget ε. However, a large ε can be easily detected by humans or by defender programs. Notice that the average number of associated relations per entity in the three datasets is between 2.3 and 2.9. We thus suggest generating effective and unnoticeable attacks for the entity alignment task with ε between 2 and 3, such that ε is smaller than the average number of associated relations.
Impact of signal threshold η. Figure 5 (b) shows the impact of η in our EAA model over three groups of datasets. The performance curves initially drop as η increases: intuitively, a larger η helps alleviate the vanishing gradient issue in the PGD adversarial attacks. Later on, the performance curves stay relatively stable or even increase as η continues to grow. A reasonable explanation is that an overly large η makes the upper bound of α too small, which results in a poorly conditioned Jacobian and thus leads to the vanishing gradient issue again. It is therefore important to determine the optimal η for the EAA model.
Related Work

Adversarial attacks on text and graph data. Recent studies have shown that NLP and graph models, especially DNN models, are highly sensitive to adversarial attacks, i.e., carefully designed small perturbations of the input intended to cause analysis failures (Song et al., 2018; Xu et al., 2019a; Huq and Pervin, 2020).
Only recently have researchers started to develop adversarial attack techniques that maximally degrade the performance of knowledge graph learning, including knowledge graph embedding (Minervini et al., 2017; Pujara et al., 2017; Pezeshkpour et al., 2019; Banerjee et al., 2021) and knowledge graph-based dialogue generation. REA detects noise in perturbed inter-graph links for robust cross-lingual entity alignment. RL-RR aims to produce deceptively perturbed knowledge graphs that maintain the downstream performance of the original knowledge graph while significantly deviating from its semantics and structure (Raman et al., 2021).

Conclusions
We have studied the problem of adversarial attacks against cross-lingual entity alignment. First, we proposed to utilize a kernel density estimation technique to estimate and maximize the densities of attacked entities and generate effective and unnoticeable perturbations by pushing the attacked entities into dense regions of the two KGs. Second, we analyzed how gradient vanishing causes failures of gradient-based adversarial attacks and designed an attack signal amplification method to ensure informative signal propagation. The EAA model achieves superior performance compared with representative attack models.

Ethical Considerations
In this work, all three knowledge graph datasets were openly released by previous works for research. All three datasets are widely used for training/evaluating cross-lingual entity alignment, for example in (Zhu et al., 2021; Mao et al., 2021). All three datasets are open-access resources that everyone can see, and no privacy-related data (such as gender, nickname, birthday, etc.) are included. All three knowledge graph datasets were originally collected and filtered from Wikipedia (under the CC BY-SA 3.0 license). Reusing them in research is allowed, but commercial use may require additional permission from the original author/copyright owner (Wik). To summarize, as research work, this work raises no concerns regarding the datasets or other aspects; however, anyone who wants to use the same or similar data commercially has to further check the licenses.

Appendix: Proof of Theorem 1

Theorem 1. Suppose (e^1_i)^T · e^2_m ≤ c and ||ẽ^2_m||_2 ≥ d for ∀e^2_m ∈ E^2. For a given 0 < η < d/2, if α_2 < (1/c) log((d − η)/η), then 1 − σ((ẽ^1_i)^T · ẽ^2_j) > η/||ẽ^2_j||_2 for ∀e^2_j ∈ E^2.

Proof. 1 − σ((ẽ^1_i)^T · ẽ^2_j) > η/||ẽ^2_j||_2 is equivalent to σ((ẽ^1_i)^T · ẽ^2_j) < 1 − η/||ẽ^2_j||_2. We convert it to 1/(1 + exp(−(ẽ^1_i)^T · ẽ^2_j)) < 1 − η/||ẽ^2_j||_2. As (e^1_i)^T · e^2_j ≤ c, we have 1/(1 + exp(−α_2 (e^1_i)^T · e^2_j)) ≤ 1/(1 + exp(−α_2 c)). If we can prove 1/(1 + exp(−α_2 c)) < 1 − η/||ẽ^2_j||_2, then 1/(1 + exp(−α_2 (e^1_i)^T · e^2_j)) < 1 − η/||ẽ^2_j||_2 follows. Thus, we need to solve exp(α_2 c) < (||ẽ^2_j||_2 − η)/η, which holds whenever α_2 < (1/c) log((d − η)/η) since ||ẽ^2_j||_2 ≥ d.