A Fused Gromov-Wasserstein Framework for Unsupervised Knowledge Graph Entity Alignment

Entity alignment is the task of identifying corresponding entities across different knowledge graphs (KGs). Although recent embedding-based entity alignment methods have shown significant advancements, they still struggle to fully utilize KG structural information. In this paper, we introduce FGWEA, an unsupervised entity alignment framework that leverages the Fused Gromov-Wasserstein (FGW) distance, allowing for a comprehensive comparison of entity semantics and KG structures within a joint optimization framework. To address the computational challenges associated with optimizing FGW, we devise a three-stage progressive optimization algorithm. It starts with a basic semantic embedding matching, proceeds to approximate cross-KG structural and relational similarity matching based on iterative updates of high-confidence entity links, and ultimately culminates in a global structural comparison between KGs. We perform extensive experiments on four entity alignment datasets covering 14 distinct KGs across five languages. Without any supervision or hyper-parameter tuning, FGWEA surpasses 21 competitive baselines, including cutting-edge supervised entity alignment methods. Our code is available at https://github.com/squareRoot3/FusedGW-Entity-Alignment.


Introduction
A Knowledge Graph (KG) is a structured data representation that characterizes real-world concepts (also known as entities) together with their relationships and attributes. Recent years have witnessed the proliferation of KGs in various areas, ranging from general ones such as DBpedia (Auer et al., 2007) and ConceptNet (Speer et al., 2017) to those in specific domains such as healthcare (Rotmensch et al., 2017), education (Chen et al., 2018), and e-commerce (Dong, 2018). As the information contained in each individual KG is limited and biased, entity alignment (EA) is proposed for linking equivalent entities across two KGs from different sources or languages and integrating them into a new holistic-view KG. The EA task has received much attention in the computational linguistics community, owing to its ability to improve the completeness and fairness of KGs and to enhance a wide range of knowledge-driven downstream applications such as question answering (Saxena et al., 2020; Chen et al., 2021) and dialogue systems (Liu et al., 2021; Xu et al., 2019c). Figure 1 illustrates a toy example of cross-lingual EA between an English KG and a Japanese KG. The main challenge of this task is to leverage the variety of information in KGs, such as entity semantics and relations.
In the deep learning era, embedding-based approaches have become the mainstream for addressing the EA task, primarily following the "embedding-learning-and-matching" paradigm. As shown in the middle of Figure 1, the embedding module encodes entities from two KGs into a shared latent space. The matching module then infers equivalent entities from the embeddings. The basic principle behind embedding-based EA is that equivalent entities in different KGs share similar neighborhood information. Graph neural networks (Chang et al., 2023; Tang et al., 2022) have been widely adopted as KG encoders, usually trained with margin-based losses that encourage equivalent entities to have similar embeddings.
However, the design of the matching module has been overlooked in embedding-based EA. Many existing methods use a greedy strategy that matches entity embeddings to their closest counterparts in another KG, which relies solely on the embedding module to incorporate structural information. Unfortunately, even the most powerful KG embedding models and graph neural networks fail to fully preserve structural information. Although some recent methods have attempted to improve the matching module by treating it as a global assignment problem (Mao et al., 2021) or an optimal transport problem (Luo and Yu, 2022), they still fall within the scope of embedding alignment and have limitations in utilizing KG structural information.
To overcome the above issue, we propose FGWEA, an unsupervised EA framework based on the Fused Gromov-Wasserstein (FGW) distance (Titouan et al., 2019), which fuses entity embedding alignment (via the Wasserstein distance) and KG structure alignment (via the Gromov-Wasserstein distance) into a joint optimization framework. As shown in Figure 1, instead of only comparing entity embeddings, as most embedding-based EA methods in the literature do, the proposed FGWEA jointly incorporates both KG semantics and structure information. In fact, FGWEA encodes cross-KG structural and relational consistencies directly into its optimization objectives to better exploit structural information, rather than implicitly encoding it into embeddings. Moreover, by shifting the handling of structural information to the matching module and relieving the workload of the embedding module, FGWEA is more compatible with pre-trained language models, which act only as a tool for encoding semantic information.
As directly optimizing FGW leads to inefficiency and inferior performance, FGWEA executes a three-stage progressive optimization algorithm, which begins with a relatively simple semantic comparison and then moves on to a more challenging structural comparison. We further develop a fast approximation algorithm and an iterative multi-view OT alignment module to efficiently compare the various KG information. Experiments on four cross-lingual and cross-source EA datasets demonstrate that FGWEA outperforms 21 existing EA methods, including both supervised and unsupervised state-of-the-art approaches.

Preliminaries

Entity Alignment (EA). Given two KGs G and G′, the EA task is to discover the set of equivalent entity pairs between G and G′, denoted as M = {(e, e′) | e ≡ e′, e ∈ E, e′ ∈ E′}, where e ≡ e′ denotes an equivalence relation between e and e′. In the unsupervised setting, the EA model predicts M without observing any pre-aligned entities.

Optimal Transport (OT)
The core concept of OT is to find a transportation plan (i.e., the coupling matrix) between two distributions that minimizes the overall transportation cost. Let |E| = m and |E′| = n; we denote μ and ν as two discrete distributions on E and E′, respectively. For simplicity, we assume that μ and ν follow the uniform distribution, that is,

μ = (1/m) ∑_{i=1}^{m} δ_{e_i},  ν = (1/n) ∑_{j=1}^{n} δ_{e′_j},

where δ_{e_i} and δ_{e′_j} are the Dirac measures at e_i and e′_j, respectively. We use Π(μ, ν) to denote the set of all the joint distributions with marginals μ and ν:

Π(μ, ν) = {π ∈ R_+^{m×n} | π 1_n = μ, π^⊤ 1_m = ν},  (1)

where π_ij signifies the amount of mass transferred from e_i in G to e′_j in G′, 1_n denotes an n-dimensional all-one vector, and π 1_n is the vector of row sums of π. The coupling matrix π describes a probabilistic matching of entities between two KGs: a larger value of π_ij indicates that e_i and e′_j are more likely to be aligned. It is worth noting that when m = n and μ, ν follow the uniform distribution, (1) corresponds to the (scaled) "assignment polytope", whose vertices correspond to the permutation matrices.
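As a concrete aside (not part of the original method description), the constraints defining Π(μ, ν) can be checked numerically in a few lines of NumPy; the helper names below are ours:

```python
import numpy as np

def uniform_marginals(m, n):
    """Uniform distributions mu on E (|E| = m) and nu on E' (|E'| = n)."""
    return np.full(m, 1.0 / m), np.full(n, 1.0 / n)

def is_coupling(pi, mu, nu, tol=1e-9):
    """Check pi in Pi(mu, nu): non-negative, row sums mu, column sums nu."""
    return bool((pi >= -tol).all()
                and np.allclose(pi.sum(axis=1), mu, atol=tol)
                and np.allclose(pi.sum(axis=0), nu, atol=tol))

m, n = 3, 3
mu, nu = uniform_marginals(m, n)

# With m = n and uniform marginals, every permutation matrix scaled by 1/m
# is a vertex of the (scaled) assignment polytope, i.e. a hard one-to-one
# matching of entities.
perm = np.eye(3)[[2, 0, 1]] / 3.0
print(is_coupling(perm, mu, nu))     # True
```

The independent coupling μ ⊗ ν (outer product of the marginals) is also feasible; it spreads each entity's mass evenly over all candidates.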
Wasserstein Distance (WD). WD is used for directly comparing two distributions, such as two sets of entity embeddings. The Wasserstein distance between μ and ν is defined as:

WD(μ, ν) = min_{π ∈ Π(μ, ν)} ∑_{i,j} C_ij π_ij,  (2)

where C_ij represents the transportation cost between e_i and e′_j, e.g., the cosine distance between entity embeddings. We denote the objective in WD as f_WD(C, π) = ∑_{i,j} C_ij π_ij := ⟨C, π⟩.
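When m = n and the marginals are uniform, (2) reduces to a linear assignment problem, so a toy instance can be solved exactly with SciPy's Hungarian solver. The sketch below uses synthetic embeddings, not the paper's pipeline; `cosine_cost` is our illustrative helper:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(0)

# Toy entity embeddings for two KGs (m = n = 4, dimension 8): the second KG
# holds noisy, permuted copies of the first KG's entities.
X = rng.normal(size=(4, 8))                           # embeddings of E
Y = X[[1, 0, 3, 2]] + 0.01 * rng.normal(size=(4, 8))  # embeddings of E'

def cosine_cost(X, Y):
    """C_ij = 1 - cos(x_i, y_j), the transport cost between entities."""
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    Yn = Y / np.linalg.norm(Y, axis=1, keepdims=True)
    return 1.0 - Xn @ Yn.T

C = cosine_cost(X, Y)
# With uniform marginals and m = n, the optimal coupling is a permutation
# matrix scaled by 1/m, recoverable by linear assignment.
rows, cols = linear_sum_assignment(C)
wd = C[rows, cols].mean()            # <C, pi*> with pi*_ij = 1/m on matches
print(cols)                          # recovered alignment: [1 0 3 2]
```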

Gromov-Wasserstein Distance
The Gromov-Wasserstein distance (GWD) (Peyré et al., 2016) is an extension of the classic OT problem, enabling the alignment of two graphs by comparing structures solely within each graph. Let A and A′ be the adjacency matrices of G and G′; GWD is defined as:

GWD(A, A′) = min_{π ∈ Π(μ, ν)} ∑_{i,j,k,l} |A_ij − A′_kl| π_ik π_jl.  (3)

In this equation, if π_ik and π_jl have large values, it suggests that (e_i, e′_k) and (e_j, e′_l) are likely to be two equivalent entity pairs. Consequently, the corresponding intra-KG pairs (e_i, e_j) and (e′_k, e′_l) should exhibit similar structures, i.e., |A_ij − A′_kl| → 0. If two KGs possess identical structures and π represents the perfect mapping between them, then GWD(A, A′) = 0. We denote the objective in GWD as f_GWD(A, A′, π).
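For intuition, the GWD objective in (3) can be evaluated by brute force on toy graphs (feasible only for a handful of nodes; the function name is ours). A perfect node mapping between two identical structures yields an objective of zero:

```python
import numpy as np
from itertools import product

def gwd_objective(A, B, pi):
    """Brute-force f_GWD = sum_{i,j,k,l} |A_ij - B_kl| * pi_ik * pi_jl."""
    m, n = pi.shape
    total = 0.0
    for i, j, k, l in product(range(m), range(m), range(n), range(n)):
        total += abs(A[i, j] - B[k, l]) * pi[i, k] * pi[j, l]
    return total

# A 4-cycle and the same 4-cycle with its nodes relabelled.
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
order = [2, 3, 0, 1]
B = A[np.ix_(order, order)]

perfect = np.zeros((4, 4))
perfect[range(4), order] = 0.25      # mass 1/m on the true node mapping
print(gwd_objective(A, B, perfect))  # 0.0: identical structures, perfect map
```

Any other coupling, such as the uniform one, incurs a strictly positive structural discrepancy on this pair of graphs.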

Fused Gromov-Wasserstein Distance (FGW).
Neither WD nor GWD alone is able to depict the full landscape of KGs. Therefore, FGW (Titouan et al., 2019) is introduced, whose objective is a linear combination of f_WD and f_GWD:

f_FGW(π) = (1 − α) f_WD(C, π) + α f_GWD(A, A′, π),  (4)

where α ∈ [0, 1] is a trade-off parameter.
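For binary adjacency matrices, |A_ij − A′_kl| = A_ij + A′_kl − 2 A_ij A′_kl, which lets the GWD term, and hence the FGW objective in (4), be evaluated without the quartic sum (a sketch in the spirit of the tensorized trick of Peyré et al. (2016); the helper names are ours):

```python
import numpy as np

def f_wd(C, pi):
    """Wasserstein term: <C, pi>."""
    return float((C * pi).sum())

def f_gwd(A, B, pi):
    """Gromov-Wasserstein term for binary adjacency matrices, using
    |A_ij - B_kl| = A_ij + B_kl - 2*A_ij*B_kl to avoid the O(m^2 n^2) sum."""
    mu, nu = pi.sum(axis=1), pi.sum(axis=0)
    return float(mu @ A @ mu + nu @ B @ nu - 2.0 * (pi * (A @ pi @ B.T)).sum())

def f_fgw(C, A, B, pi, alpha):
    """FGW objective: trade-off between semantic and structural discrepancy."""
    return (1.0 - alpha) * f_wd(C, pi) + alpha * f_gwd(A, B, pi)

# Two identical 3-node path graphs and a semantic cost favouring the identity.
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
C = 1.0 - np.eye(3)
ident = np.eye(3) / 3.0              # the correct one-to-one coupling
print(f_fgw(C, A, A, ident, alpha=0.5))   # close to zero: both terms vanish
```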
However, several challenges emerge when applying FGW to the EA task. First, GWD assumes that both A and A′ are homogeneous graphs, whereas KGs are heterogeneous graphs containing relational information. Second, KG entities possess various forms of side information, such as names and attributes, complicating the accurate measurement of entity similarity and the computation of the cost matrix C in WD. Third, although Titouan et al. (2019) invoke the Frank-Wolfe method for optimizing FGW, its effectiveness has only been confirmed on small graphs with hundreds of nodes. We observe that directly applying this method to large-scale sparse KGs results in unstable performance and reduced efficiency. To tackle these issues, we propose a novel EA approach based on FGW in the following section.

The Proposed Method
We present an unsupervised EA framework, FGWEA, that performs entity matching based on the FGW distance. As shown in Figure 2, it comprises a semantic embedding module and a three-stage entity matching module. To address the aforementioned challenges, we propose a three-stage progressive optimization algorithm. First, FGWEA performs straightforward semantic embedding matching to obtain high-confidence aligned entity pairs as anchors (Section 3.1). Building on these anchors, FGWEA employs a fast approximation of GWD to compute cross-KG structural and relational similarities, which are then used for iterative multi-view OT alignment (Section 3.2). Upon achieving a better initial point for the coupling matrix, FGWEA proceeds to compare the global structures of KGs by optimizing GWD, the most challenging component in FGW (Section 3.3).

Semantic Embedding and Comparison
The embedding module in FGWEA is responsible for encoding entity semantic information, primarily derived from entity names and attributes. Given the remarkable success of pre-trained language models, we employ LaBSE (Feng et al., 2022) for embedding multilingual KGs and SimCSE (Gao et al., 2021) for embedding monolingual KGs, both of which are variations of BERT-base (Devlin et al., 2019) tailored for semantic similarity modeling. It is important to note that our embedding module does not necessitate fine-tuning, and any pre-trained sentence Transformers can be used as

Figure 2: Framework Overview.The embedding module calculates name and attribute embeddings for each entity in KGs.The matching module consists of three stages: semantic comparison (Section 3.1), multi-view iterative OT alignment (Section 3.2), and Gromov-Wasserstein refinement (Section 3.3).
a substitute, such as those presented by (Reimers and Gurevych, 2019).
We represent the entity name of e_i as n_{e_i} and concatenate all attribute triples related to e_i into a single string denoted as a_{e_i} (in the form of a_1 l_1 a_2 l_2 · · ·). The order of the triples depends on the attribute frequency in the KG. Let enc(·) be the encoder function; we calculate the name similarity-based cost matrix C^name and the attribute similarity-based cost matrix C^attr between two KGs as follows:

C^name_ij = 1 − cos(enc(n_{e_i}), enc(n_{e′_j})),  C^attr_ij = 1 − cos(enc(a_{e_i}), enc(a_{e′_j})).  (5)

In the first matching stage, we use the sum of the two semantic cost matrices as the cost in WD and calculate the initial coupling matrix π^0 by:

π^0 = argmin_{π ∈ Π(μ, ν)} ⟨C^name + C^attr, π⟩.  (6)

Specifically, we use the Sinkhorn algorithm (Cuturi, 2013) to tackle this problem, and collect high-confidence entity pairs in π^0 as anchors to facilitate the subsequent matching process. Let M^0_a denote the initial anchor set and c = 1/max(m, n) be the maximum potential value of π. We have

M^0_a = {(e_i, e′_j) | π^0_ij > c − ϵ},

where ϵ is a small threshold satisfying ϵ < c/2 to ensure one-to-one alignment.
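The first stage can be sketched as follows: run Sinkhorn on a semantic cost matrix and keep the near-maximal entries of the coupling as anchors. This is a toy NumPy sketch of the procedure described above, with our own helper names; the toy cost is far less peaked than real embedding similarities, so a larger ϵ is used here than in the actual experiments:

```python
import numpy as np

def sinkhorn(C, eta=0.1, n_iter=10):
    """Entropic OT with uniform marginals (Cuturi, 2013)."""
    m, n = C.shape
    K = np.exp(-C / eta)
    u, v = np.ones(m), np.ones(n)
    for _ in range(n_iter):
        u = (1.0 / m) / (K @ v)
        v = (1.0 / n) / (K.T @ u)
    return u[:, None] * K * v[None, :]

def extract_anchors(pi, eps):
    """High-confidence pairs: entries of pi close to c = 1/max(m, n).
    With eps < c/2, no row or column can contribute two anchors."""
    m, n = pi.shape
    c = 1.0 / max(m, n)
    return {(i, j) for i in range(m) for j in range(n) if pi[i, j] > c - eps}

# Toy semantic cost: entity i in G is most similar to entity i in G'.
C = np.ones((4, 4)) - np.eye(4)
pi0 = sinkhorn(C)
print(sorted(extract_anchors(pi0, eps=0.01)))  # the four diagonal pairs
```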

Approximated GWD for Multi-view Iterative OT Alignment

In the second stage, our goal is to incorporate KG structural and relational information into the matching process. Instead of directly optimizing the GWD or FGW objective, we develop an approximate alternative for the sake of efficiency.
Relation-aware GWD. We extend the structural comparison in (3) with relational information, replacing the structural cost |A_ij − A′_kl| with a relation-aware cost:

f_rGWD(π) = ∑_{i,j,k,l} (1 − sim(r_{i,j}, r′_{k,l})) π_ik π_jl,  (7)

where r_{i,j} represents the relation between e_i and e_j. The relation similarity sim(r_{i,j}, r′_{k,l}) = 1 if A_ij = A′_kl = 1 and r_{i,j} ≡ r′_{k,l}, and 0 otherwise. As the relation sets in different KGs are also unaligned, we align these relations based on relation name similarity, using the same process as in Section 3.1.
Approximation. However, optimizing (7) is even more challenging than optimizing GWD. We simplify it by approximating π_ik in (7) with a sparse coupling matrix π̂ based on the anchor set M^0_a. Specifically, π̂_ik = c if (e_i, e′_k) ∈ M_a, and π̂_ik = 0 otherwise. Note that the closer M_a is to the ground-truth alignment, the more accurate the approximation of GWD. Afterward, (7) is converted to a WD objective:

∑_{j,l} (1 − c S^rel_{j,l}) π_jl,  (8)

where S^rel_{j,l} reflects the relation similarity between e_j and e′_l. It is calculated by counting the number of anchors (e_i, e′_k) ∈ M_a in which e_i is a neighbor of e_j, e′_k is a neighbor of e′_l, and r_{i,j} ≡ r′_{k,l}. S^rel can be efficiently computed by iterating through all anchor pairs and comparing their corresponding neighbor node pairs. Figure 3 illustrates the computation process. If (e_3, e′_2) is an anchor and e_1, e′_1 are corresponding neighbors with equivalent relations r_1 ≡ r′_1, then (e_3, e′_2) contributes to the relation similarity S^rel_{1,1}. In the right of Figure 3, we repeat this process to calculate the relation-agnostic structure similarity matrix S^stru, which can be regarded as an approximation of GWD that only compares anchor entity pairs with other pairs.
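The anchor-based computation of S^rel can be sketched as follows, with toy triples mirroring the Figure 3 example. All names here are illustrative, and relations are assumed to be pre-aligned by name so that equivalent relations share an identifier:

```python
from collections import defaultdict

# Toy KGs as relation triples (head, relation, tail).
triples_g  = [(1, "r1", 3), (2, "r2", 3)]
triples_g2 = [(1, "r1", 2), (3, "r2", 2)]
anchors = {(3, 2)}          # high-confidence pair (e_3 in G, e'_2 in G')

def relation_similarity(triples_g, triples_g2, anchors):
    """S_rel[(j, l)] counts anchors (i, k) where i neighbors j via relation r,
    k neighbors l via an equivalent relation r' = r."""
    nbr_g, nbr_g2 = defaultdict(set), defaultdict(set)
    for h, r, t in triples_g:
        nbr_g[h].add((t, r)); nbr_g[t].add((h, r))
    for h, r, t in triples_g2:
        nbr_g2[h].add((t, r)); nbr_g2[t].add((h, r))
    s_rel = defaultdict(int)
    for i, k in anchors:                  # iterate anchors, not all pairs
        for j, r in nbr_g[i]:
            for l, r2 in nbr_g2[k]:
                if r == r2:
                    s_rel[(j, l)] += 1
    return dict(s_rel)

print(relation_similarity(triples_g, triples_g2, anchors))
```

The single anchor (3, 2) votes for the pairs (e_1, e′_1) via r1 and (e_2, e′_3) via r2, exactly as in the Figure 3 narrative.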

Multi-view OT Alignment
To perform a joint comparison of structures and semantics between KGs, we rescale 1 − cS^rel and 1 − cS^stru to the range [0, 1] and obtain the corresponding cost matrices C^rel and C^stru. The multi-view OT combines all four cost matrices, which represent discrepancies between KGs from different perspectives:

π^*_1 = argmin_{π ∈ Π(μ, ν)} ⟨C^name + C^attr + C^rel + C^stru, π⟩.  (9)

We derive π^*_1 and update the anchor set M^1_a with the same process as in Section 3.1. With M^1_a, we can adjust C^stru and C^rel accordingly, resulting in a new OT problem and a new coupling matrix π^*_2. We repeat this process for a fixed number of epochs in order to gradually improve the completeness of the anchor set. The final coupling matrix of the second stage is denoted as π^*_OT.
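The iterative loop of the second stage, combining the four cost views and re-extracting anchors each epoch, might look like the schematic sketch below. Here `build_struct_costs` stands in for the S^rel/S^stru computation and is stubbed with zero costs in the toy call; all names are ours:

```python
import numpy as np

def sinkhorn(C, eta=0.1, n_iter=10):
    """Entropic OT with uniform marginals."""
    m, n = C.shape
    K = np.exp(-C / eta)
    u, v = np.ones(m), np.ones(n)
    for _ in range(n_iter):
        u = (1.0 / m) / (K @ v)
        v = (1.0 / n) / (K.T @ u)
    return u[:, None] * K * v[None, :]

def multi_view_ot(C_name, C_attr, build_struct_costs, epochs=6, eps=0.01):
    """Alternate between solving OT on the combined cost and enlarging the
    anchor set used to rebuild the structural/relational cost views."""
    m, n = C_name.shape
    c = 1.0 / max(m, n)
    anchors, pi = set(), None
    for _ in range(epochs):
        C_rel, C_stru = build_struct_costs(anchors)   # from current anchors
        pi = sinkhorn(C_name + C_attr + C_rel + C_stru)
        anchors = {(i, j) for i in range(m) for j in range(n)
                   if pi[i, j] > c - eps}
    return pi, anchors

# Toy run: semantic costs favour the identity matching; the structural view
# is stubbed out with zero costs for brevity.
C_name = 1.0 - np.eye(4)
zeros = lambda a: (np.zeros((4, 4)), np.zeros((4, 4)))
pi, anchors = multi_view_ot(C_name, C_name.copy(), zeros)
print(sorted(anchors))               # the four diagonal pairs
```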

Gromov-Wasserstein Refinement
Although the approximated GWD has the advantages mentioned above, the reliance on the anchor set may lead to accumulated errors. Therefore, in the final matching stage, we consider the following FGW objective:

f_FGW(π) = (1 − α) f_WD(C, π) + α f_GWD(A, A′, π).  (10)

Due to the difficulty of optimizing f_FGW discussed in Section 2.3, we only optimize the second term f_GWD to improve stability. We employ the Bregman Proximal Gradient (BPG) algorithm, introduced by Xu et al. (2019a) and shown to have a local linear convergence guarantee by Li et al. (2022). For the k-th iteration, BPG takes the form

π^{(k+1)} = argmin_{π ∈ Π(μ, ν)} ⟨∇f_GWD(A, A′, π^{(k)}), π⟩ + (1/β) KL(π ∥ π^{(k)}),

where β is the step size and KL(·∥·) is the Kullback-Leibler divergence. As such, the π-update is identical to an entropic OT problem, and we can invoke the Sinkhorn algorithm to tackle it. Our GW refinement process incorporates two improvements to BPG. First, we use π^*_OT as the initial point rather than the uniform distribution, which significantly facilitates the optimization process. Second, we employ the relative change of f_FGW instead of f_GWD as the optimization stopping criterion, which more accurately reflects the discrepancy between KGs. In the following section, we will test the effectiveness of the proposed FGWEA with the progressive optimization algorithm.

Experiments

Datasets. We evaluate FGWEA on two multilingual datasets, DBP15K (Sun et al., 2017) and SRPRS (Guo et al., 2019), and two monolingual multi-source datasets, D-W-15K-V2 (Sun et al., 2020) and Med-BBK-9K (Qi et al., 2021). Statistics of these datasets are in Table 1. For a detailed description, please refer to Appendix A.
Evaluation Metrics. On DBP15K and SRPRS, we use HitK and MRR to evaluate the performance of all EA methods. HitK calculates the percentage of entities in G whose counterparts in G′ are in the top-K candidates of the model output. MRR is the mean reciprocal rank. On D-W-15K-V2 and Med-BBK-9K, we adopt another evaluation protocol for a comprehensive evaluation, as suggested by Leone et al. (2022). We use the standard classification-based metrics, i.e., precision (P), recall (R), and F1 scores between the set of all predicted entity pairs and that of ground-truth entity pairs.

Implementation Details. Unlike most neural-based EA methods, the proposed FGWEA requires no hyper-parameter tuning, and we use the same hyper-parameters across all datasets. We update 6 epochs for multi-view OT alignment and set the threshold ϵ to 1e-5. In all places where the Sinkhorn algorithm is used, we set the entropic regularization weight η to 0.1 and the number of iterations to 10. We set α in the FGW objective (10) to be the average graph density of A and A′ to maintain a balance between the magnitudes of the WD and GWD terms.

Results on Cross-lingual EA Datasets
DBP15K is the most widely-adopted EA dataset.
Unfortunately, the experimental configurations of different baselines on this dataset are highly inconsistent, leading to unfair comparisons. After a careful study of existing work, we identify four factors that significantly affect the results: (1) the inclusion of entity names, (2) the utilization of attribute triples, (3) the use of Google translation for non-English entities, and (4) the ratio of entity links used for supervision.
Based on factors (1-3), we categorize baselines into five groups and run FGWEA using the configuration of each group. The experimental settings and results of all compared baselines and FGWEA are in Table 2. As observed, FGWEA achieves the best performance in terms of Hit1 and MRR in all five groups. Specifically, the unsupervised FGWEA outperforms two state-of-the-art supervised EA approaches, BERT-INT and MCLEA. SelfKG and ICLEA are two graph neural network-based methods that use the same pre-trained language model, LaBSE, to encode semantic information. However, our approach outperforms them by a significant margin, demonstrating its ability to utilize KG structures. UED and CPL-OT, which are also based on OT for alignment, do not perform as well as FGWEA, suggesting that the FGW distance we introduce is more suitable for this task. Table 3 reports the results on the SRPRS dataset. BERT-INT uses 30% of entity links for training, and the other baselines are unsupervised. While most baselines rely on translated entity names to overcome the language barrier, FGWEA achieves the best performance with untranslated entity names. It surpasses LightEA, the current leading method on this dataset, by reducing the error rate from 1.2% to only 0.3% on SRPRS EN_DE.

Results on Cross-source EA Datasets
Cross-source EA poses more challenges than EA within the same knowledge source due to larger discrepancies in the schema and topology of KGs from different sources. For example, in D-W-15K-V2, we find that the KG from WikiData uses QIDs as entity names. To facilitate semantic comparison in FGWEA, we replace these QIDs with entity attributes that possess linguistic information.
In Table 4, we compare FGWEA with 7 EA methods that were not included in the cross-lingual EA evaluation. The results show that FGWEA consistently outperforms all baselines on both datasets in terms of precision, recall, and F1 scores. Remarkably, FGWEA outperforms PARSE, the previous best-performing method on this dataset, by 11.9% in terms of F1. FGWEA also surpasses PARIS, a conventional approach that has shown superior performance to all neural-based EA methods in a recent study (Leone et al., 2022).

Ablation Study and Model Efficiency
To validate the effectiveness and efficiency of each component in FGWEA, we compare it with several ablations. First, we remove Gromov-Wasserstein refinement, the third matching stage in FGWEA, and refer to this version as FGWEA w/o (without) GW. Then, we further remove the relational comparison and the structural comparison in the second matching stage, obtaining FGWEA w/o C^rel and FGWEA w/o C^stru, respectively. GWD-only is a baseline that directly optimizes GWD for alignment without using the progressive optimization algorithm in FGWEA. Emb-Match directly matches entities based on entity semantic embeddings.
As shown in Table 5, FGWEA performs the best among these variants, which validates the effectiveness of the proposed progressive optimization algorithm. Removing GW refinement from FGWEA results in a decrease in performance on all datasets and a significant reduction in computational time. Removing either the relational comparison or the structural comparison also leads to a decline in performance, while the time consumption does not change significantly. Besides, directly optimizing GWD between KG structures is ineffective for the EA task, and aligning entity semantic embeddings alone also performs poorly. This highlights the importance of considering structural and semantic information jointly.
Note that Table 5 only reports the time spent on the matching module. The embedding module takes approximately 5 minutes to run on DBP15K and 3 minutes on the other datasets. On average, it takes approximately 10 minutes to run FGWEA on these datasets, which is relatively efficient compared to most embedding-based EA methods.

Visualization of the FGW Objective
In Figure 4, we visualize the objective function in (10) and the corresponding Hit1 score over 400 epochs of GW refinement on DBP15K ZH_EN without translation and attributes. We find a strong correlation between the two curves: the iteration corresponding to the minimum FGW objective value is approximately the one with the maximum Hit1 score. This suggests that the FGW objective can be utilized as an unsupervised metric to estimate alignment performance and to help determine when to stop optimization in GW refinement.

Related Work

Unsupervised Entity Alignment
We categorize the existing unsupervised entity alignment methods into three groups: (1) Traditional heuristic EA systems. LogMap (Jiménez-Ruiz and Grau, 2011) and PARIS (Suchanek et al., 2011) are two well-known traditional EA systems that iteratively discover entity links via logical inference, lexical matching, and probabilistic reasoning. PARSE (Qi et al., 2021) is an enhanced version of PARIS that combines probabilistic reasoning and semantic embedding.
(2) Self-supervised neural EA methods. SelfKG (Liu et al., 2022) uses a graph neural network to aggregate entity embeddings of one-hop neighbors, and proposes a similarity metric between the entities of two KGs for contrastive learning. ICLEA (Zeng et al., 2022a) conducts bidirectional contrastive learning by building pseudo-aligned entity pairs as pivots for cross-KG interaction.
(3) Optimization-based non-neural EA methods. SEU (Mao et al., 2021) transforms the EA problem into an assignment problem. LightEA (Mao et al., 2022b) is a non-neural framework that reinvents the label propagation algorithm to run effectively on KGs. Our proposed FGWEA also belongs to this group.

Optimal Transport for Entity Alignment
There have been a few approaches that use OT to improve EA performance. OTEA (Pei et al., 2019) is a supervised method that adopts the basic TransE (Bordes et al., 2013)

Conclusion
In this paper, we propose an unsupervised entity alignment framework named FGWEA. Instead of following the "embedding-learning-and-matching" paradigm, we invoke the Fused Gromov-Wasserstein distance to realize a more explicit and comprehensive comparison of structural and semantic information between knowledge graphs. To realize the benefits of FGWEA, we present a three-stage progressive optimization algorithm to address the challenge of optimizing the FGW objective. Experimental results show that FGWEA outperforms both supervised and unsupervised state-of-the-art entity alignment methods.

Limitations
Although the proposed FGWEA framework demonstrates superior performance on multiple public EA datasets, there are still some limitations that require further research.
Scalability. In this paper, we have successfully extended FGW to KGs with tens of thousands of entities, which is the common size of domain-specific KGs. However, real-world general-domain KGs can be much larger, containing millions of entities. The most time-consuming step in FGWEA, the Gromov-Wasserstein refinement, has quadratic time complexity O(|E||T′_r| + |E′||T_r|) and thus cannot be directly applied to million-scale KGs. There are three ways to further scale up FGWEA. First, we can remove the most time-consuming step, GW refinement, while FGWEA still achieves competitive performance (Table 5). Second, we can use recent divide-and-conquer methods (Xin et al., 2022; Zeng et al., 2022b; Li et al., 2021) to divide large-scale KGs into smaller subgraph pairs, and then apply alignment methods to each subgraph pair. Third, the coupling matrix π can be restricted to a sparse matrix that only considers the top-k candidates for each entity, and the computation can be accelerated by mask OT (Gasteiger et al., 2021) or sparse Sinkhorn iteration (Mao et al., 2022b).
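As a sketch of the third option, restricting π to each entity's top-k candidates amounts to building a boolean mask over the cost matrix; only the masked entries of the coupling would then be stored and updated. The helper below is illustrative, not from the released code:

```python
import numpy as np

def topk_mask(C, k):
    """Boolean mask keeping, for each entity (row), only its k lowest-cost
    candidates; entries outside the mask can be dropped from a sparse pi."""
    mask = np.zeros(C.shape, dtype=bool)
    idx = np.argpartition(C, k - 1, axis=1)[:, :k]
    np.put_along_axis(mask, idx, True, axis=1)
    return mask

C = np.arange(16.0).reshape(4, 4)    # toy cost matrix
mask = topk_mask(C, 2)
print(mask.sum())                    # 8: two candidates kept per entity
```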
Dealing with dangling cases. FGWEA assumes that all entities initially have equal probabilities of being matched, by using the uniform distribution. Therefore, it has limited ability to handle dangling entities whose counterparts are unavailable in the other KG (Sun et al., 2021). To overcome this limitation, we can invoke unbalanced OT (Chizat et al., 2018) or unbalanced GWD (Sejourne et al., 2021), which relax the assumption of equal probabilities for all entities.

A Dataset Description
All the datasets used in our evaluation are publicly available on the Internet. DBP15K consists of three subsets of cross-lingual KG pairs extracted from DBpedia: DBP15K ZH_EN (Chinese to English), DBP15K JA_EN (Japanese to English), and DBP15K FR_EN (French to English). Each KG pair contains 15,000 pre-aligned entity links. SRPRS is a sparse dataset that includes two cross-lingual KG pairs extracted from DBpedia: SRPRS EN_FR (English to French) and SRPRS EN_DE (English to German). Each subset of SRPRS also contains 15,000 entity links, but with fewer relation triples and no attribute triples. D-W-15K-V2 consists of two English KGs extracted from DBpedia and WikiData, respectively, with 15,000 pre-aligned entity links. MED-BBK-9K is an industry dataset containing two Chinese medical KGs with 9,162 entity links; one is an authoritative human-annotated KG and the other is extracted from a Chinese online encyclopedia called Baidu Baike. D-W-15K-V2 is licensed under the GNU General Public License v3.0, while the other datasets are licensed under the MIT License.

B More Examples of the FGW Objective
As in Section 4.5, in Figures 5 and 6 we visualize the objective function in (10) on DBP15K EN_JA and SRPRS EN_FR. The observation is consistent with Section 4.5: the two curves are highly correlated, and the iteration corresponding to the minimum FGW objective value is approximately the one with the maximum Hit1 score.

C Additional Results
Several studies have pointed out that many entities in DBP15K can be directly matched by string comparison (Liu et al., 2020). In light of this, we perform additional experiments on a hard test-set split of DBP15K, as introduced by Liu et al. (2020), to minimize the influence of name bias. Furthermore, to demonstrate that FGWEA's exceptional performance cannot be solely credited to the powerful LaBSE encoder, we use the mean pooling of bert-base-multilingual-cased as FGWEA's new semantic encoder. The embedding matching accuracy of this encoder is only 16.1% in the hard setting of DBP15K ZH_EN. Nonetheless, FGWEA continues to achieve competitive results, as shown in Table 6, surpassing AttrGNN, the current top-performing method for this setting (Liu et al., 2020).

Figure 1 :
Figure 1: Top: A toy example of cross-lingual entity alignment. Middle and bottom: Comparison between embedding-based EA and our proposed FGWEA.
Let E, R, A, and L denote the set of entities, relations, attributes, and literals, respectively. Following Qi et al. (2021), a KG contains a set of relation triples T_r = {(e_i, r_j, e_k)} and attribute triples T_a = {(e_i, a_j, l_k)}, denoted as G = (E, R, A, L, T_r, T_a). Instances of the two types of triples are 〈Pokémon, Publisher, Nintendo〉 and 〈Pokémon, FirstReleaseDate, 1996-02-27〉 in Figure 1. While attribute triples are an essential component of a KG, some EA datasets simplify them by only considering the relation triples, i.e., G = (E, R, T_r). Besides, we denote the adjacency matrix of G as A, where A_ij = 1 if e_i and e_j are connected by at least one relation, and 0 otherwise.

Figure 3 :
Figure 3: Illustration of how anchor links contribute to S stru and S rel .

Figure 4 :
Figure 4: Visualization of the relationship between the objective function and alignment performance (Hit1) of FGWEA in the GW refinement process.

Figure 5 :
Figure 5: Visualization of the relationship between the objective function and alignment performance on DBP15K EN_JA .

Table 1 :
Dataset statistics.|E|, |R| and |T r | represent the number of entities, relation types and relation triplets in each KG, respectively.

Table 2 :
Evaluation results of all compared EA methods on DBP15K under different configurations. Name, Attr., and Trans. represent the usage of entity name, attribute, and translation information, respectively. Sup. indicates the ratio of entity links used for supervision. Methods marked with * use additional information not in DBP15K.

Table 3 :
Evaluation results on the SRPRS dataset. Methods marked with * use the translated entity name.

We set the step size β in BPG to 100 and the maximum number of iterations to 2000. The only exception is that we encounter numerical errors on the Med-BBK-9K dataset, and thus decrease β to 50. Our model is implemented in PyTorch. All experiments are performed on a Linux server with an AMD Ryzen 9 5950X CPU and an NVIDIA GeForce RTX 3090 GPU.

Table 4 :
Results on cross-source EA datasets.

Table 5 :
Ablation study of FGWEA.The wall-clock time is measured in seconds.

Table 6 :
Results comparison between FGWEA and AttrGNN on a hard setting of DBP15K.