Efficient Federated Learning on Knowledge Graphs via Privacy-preserving Relation Embedding Aggregation

Federated Learning (FL) on knowledge graphs (KGs) has yet to be as well studied as other domains, such as computer vision and natural language processing. A recent study, FedE, first proposes an FL framework that shares entity embeddings of KGs across all clients. However, compared with model sharing in vanilla FL, the entity embedding sharing in FedE incurs severe privacy leakage. Specifically, a known entity embedding can be used to infer whether a specific relation between two entities exists in a private client. In this paper, we first develop a novel attack that aims to recover the original data from embedding information, which we then use to evaluate the vulnerabilities of FedE. Furthermore, we propose a Federated learning paradigm with privacy-preserving Relation embedding aggregation (FEDR) to tackle the privacy issue in FedE. Compared to entity embedding sharing, the relation embedding sharing policy significantly reduces communication cost because the set of relations is much smaller than the set of entities. We conduct extensive experiments to evaluate FEDR with five different embedding learning models and three benchmark KG datasets. Compared to FedE, FEDR achieves similar utility and significant (nearly 2×) improvements in both privacy and efficiency on the link prediction task.


Introduction
Knowledge graphs (KGs) are critical data structures for representing human knowledge and serve as resources for various real-world applications, such as recommendation and question answering (Gong et al., 2021; Liu et al., 2018). However, most KGs are incomplete and naturally distributed across different clients. Although each client can explore the missing links in its own KG with knowledge graph embedding (KGE) models (Lin et al., 2015), exchanging knowledge with others can further enhance completion performance because different KGs usually share overlapping elements (Chen et al., 2021; Peng et al., 2021). To exchange knowledge, the first federated learning (FL) framework for KGs, FedE, was recently proposed: each client trains local embeddings on its KG while the server receives and aggregates only locally-computed updates of entity embeddings instead of collecting triplets directly (Chen et al., 2021). However, at the very beginning of FedE, the server must collect the entity sets of every client for entity alignment, which leads to unintentional privacy leakage: 1) entity information, such as a customer's name, is usually sensitive but is fully exposed to the server; 2) if the server is malicious, relation embeddings can be inferred and exploited for a knowledge graph reconstruction attack (see Section 3.1). Therefore, we propose FEDR, which adopts relation embedding aggregation to tackle the privacy issue in FedE. The major difference is shown in Figure 1. Besides, the number of entities is usually much greater than the number of relations in real-world graph databases, so sharing relation embeddings is also more communication-efficient.
We summarize the contributions of our work as follows. 1) We present a KG reconstruction attack method and reveal that FedE suffers from potential privacy leakage under a malicious server and its colluding clients. 2) We propose FEDR, an efficient and privacy-preserving FL framework on KGs. Experimental results demonstrate that FEDR achieves competitive performance compared with FedE while gaining substantial improvements in privacy-preserving effect and communication efficiency.

Background
Knowledge graph and its embedding. A KG is a directed multi-relational graph whose nodes correspond to entities and whose edges take the form (head, relation, tail), denoted as a triplet (h, r, t). A KGE model aims to learn low-dimensional representations of the elements in a KG by maximizing the scoring function f(h, r, t) over the embeddings of all triplets. In other words, as depicted in Figure 1, given entity embeddings we can infer a relation embedding via r = arg max_r f(h, r, t), but we cannot obtain t = arg max_t f(h, r, t) based only on a known relation embedding r.
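This asymmetry can be made concrete with a small sketch (assuming the DistMult scoring function f(h, r, t) = Σ h·r·t and illustrative random NumPy data): given h, t, and a table of candidate relation embeddings, the best relation is found by exhaustive scoring, whereas recovering t from r alone would require an entity table to enumerate over.

```python
import numpy as np

def distmult_score(h, r, t):
    """DistMult scoring function: f(h, r, t) = sum(h * r * t)."""
    return float(np.sum(h * r * t))

def infer_relation(h, t, relation_table):
    """r = arg max_r f(h, r, t): score every candidate relation embedding."""
    scores = [distmult_score(h, r, t) for r in relation_table]
    return int(np.argmax(scores))

rng = np.random.default_rng(0)
dim, num_rel = 8, 5
relation_table = rng.normal(size=(num_rel, dim))   # known relation embeddings
h, t = rng.normal(size=dim), rng.normal(size=dim)  # known entity embeddings
best = infer_relation(h, t, relation_table)
# The converse, t = arg max_t f(h, r, t), needs an entity table to enumerate,
# which is exactly what relation-only sharing withholds from the server.
```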
Federated learning and FedE. FL allows different clients to collaboratively learn a global model without sharing their local data (McMahan et al., 2017). In particular, the aim is to minimize f(w) = Σ_{k=1}^K (n_k / n) F_k(w), where F_k(w) = (1/n_k) Σ_{i ∈ P_k} f_i(w) is the local objective that measures the local empirical risk of the k-th client, P_k is the local data of the k-th client with n_k = |P_k|, and n = Σ_k n_k. Compared to model sharing in vanilla FL, FedE introduces a new mechanism that aggregates only entity embeddings. More concretely, the server maintains a complete table of entity embeddings with their corresponding entity IDs, so the server can identify whether an entity exists in a client for entity alignment.

Knowledge Graph Reconstruction
The purpose of the knowledge graph reconstruction attack is to recover the original entities and relations in a KG given a traitor's information, including partial or all triplets and the corresponding embeddings, namely element-embedding pairs. The attack procedure for FedE is summarized as follows (suppose there is a malicious server and one traitor): 1) The server colludes with one client C1 to obtain its element-embedding pairs {(E, e), (R, r)}.
2) Infer the target client's relation embedding by calculating r = arg max_r f(h, r, t).
3) Measure the discrepancy between an inferred element embedding, such as the relation embedding r, and all known embeddings with cosine similarity. 4) Infer the relation (entity) as the R (E) with the largest similarity score. The target client's KG/triplets can then be reconstructed. More details are included in Appendix A.
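Steps 3–4 can be sketched as follows (a minimal NumPy version; the relation names and embeddings are hypothetical toy data, not from the paper):

```python
import numpy as np

def cosine_similarity(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def match_element(inferred, known_pairs):
    """Steps 3-4: map an inferred embedding back to the traitor element whose
    known embedding is most similar under cosine similarity."""
    sims = [cosine_similarity(inferred, emb) for _, emb in known_pairs]
    return known_pairs[int(np.argmax(sims))][0]

# Toy element-embedding pairs leaked by the traitor C1 (hypothetical names):
known = [("born_in",    np.array([1.0, 0.0, 0.0])),
         ("works_at",   np.array([0.0, 1.0, 0.0])),
         ("capital_of", np.array([0.0, 0.0, 1.0]))]
inferred = np.array([0.05, 0.93, 0.10])  # noisy estimate from step 2
match = match_element(inferred, known)   # → "works_at"
```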
Privacy leakage quantization in FedE. We define two metrics, Triplet Reconstruction Rate (TRR) and Entity Reconstruction Rate (ERR), to measure the ratio of correctly reconstructed triplets and entities, respectively, to the total number of the corresponding elements. We let the server own 30%, 50%, or 100% of the trained element-embedding pairs from C1, the traitor, to reconstruct the entities and triplets of the other clients. The results of privacy leakage on FB15k-237 (Toutanova et al., 2015) over three clients are summarized in Table 1. LR in the table denotes the information (element-embedding pair) leakage ratio from C1. It is clear that the server only needs to collude with one client to obtain most of the information in the KGs of the other clients. In short, FedE is not privacy-preserving.
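Both metrics reduce to a set-overlap ratio; a minimal sketch with toy triplet identifiers (hypothetical data):

```python
def reconstruction_rate(reconstructed, ground_truth):
    """Fraction of ground-truth elements the attacker recovered correctly.
    Applied to triplets this gives TRR; applied to entities, ERR."""
    return len(set(reconstructed) & set(ground_truth)) / len(ground_truth)

truth = {("A", "r1", "B"), ("B", "r2", "C"), ("C", "r1", "D"), ("A", "r2", "D")}
guess = {("A", "r1", "B"), ("B", "r2", "C"), ("A", "r2", "C")}
trr = reconstruction_rate(guess, truth)  # 2 of 4 triplets recovered → 0.5
```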

FEDR
The overall procedure of the FEDR framework is described in Algorithm 1. Before aggregation starts, the server acquires the IDs of all unique relations from the local clients and maintains a relation table via Private Set Union (PSU), which computes the union of relations for relation alignment without revealing anything else (Kolesnikov et al., 2019). Hence, the server does not know which relations each client holds. The constructed relation table is then distributed to each client. In each communication round, a subset of clients is selected to perform local training (see Appendix B.2) to update the element embeddings E^c, which are masked by the masking indicator M^{r,c} and then uploaded to the server. Here M^{r,c}_i = 1 indicates that the i-th entry of the relation table exists in client c. Considering that the server could retrieve the relations of each client by detecting whether an embedding is a zero vector, we exploit the Secure Aggregation technique (SecAgg, see Appendix C) in the aggregation phase, as described in line 8 of Algorithm 1, where ⊘ is element-wise division, ⊗ is element-wise multiplication, and 1 is an all-one vector. The fundamental idea behind SecAgg is to mask the uploaded embeddings such that the server cannot obtain the actual ones from any single client, while the masks cancel out in the sum, so the correct aggregation result is preserved (Bonawitz et al., 2017). Specifically, in FEDR, the server cannot access the individual masking vectors v^c and embeddings E^{r,c}_{t+1} but only their correct sums, namely Σ_{c=1}^{C_t} v^c and Σ_{c=1}^{C_t} E^{r,c}_{t+1}, respectively. At the end of round t, the aggregated E_{t+1} is sent back to each client c ∈ C_t for the next-round update.
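The plaintext effect of the aggregation rule in line 8 can be sketched as follows (a simplified NumPy view that omits the SecAgg masking; variable names are ours):

```python
import numpy as np

def fedr_aggregate(masked_embeddings, mask_vectors):
    """Plaintext view of line 8 in Algorithm 1: average each relation embedding
    over the clients that hold that relation.

    masked_embeddings: per-client (num_relations, dim) arrays M^{r,c} ⊗ E^{r,c},
        with zero rows for relations the client does not hold.
    mask_vectors: per-client (num_relations,) 0/1 indicator vectors v^c.
    Under SecAgg the server only ever sees the two sums computed below,
    never the individual client terms."""
    emb_sum = np.sum(masked_embeddings, axis=0)          # Σ_c M^{r,c} ⊗ E^{r,c}
    count = np.maximum(np.sum(mask_vectors, axis=0), 1)  # Σ_c v^c (guard zeros)
    return emb_sum / count[:, None]                      # ⊘: element-wise division

# Two clients, two relations: client 0 holds r0 only, client 1 holds both.
E0 = np.array([[1.0, 1.0], [0.0, 0.0]])
E1 = np.array([[3.0, 3.0], [2.0, 2.0]])
v0, v1 = np.array([1, 0]), np.array([1, 1])
agg = fedr_aggregate([E0, E1], [v0, v1])
# r0 row → mean of [1,1] and [3,3] = [2,2]; r1 row → [2,2] from client 1 alone
```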

Experiments
We carry out several experiments to explore FEDR's performance on link prediction, where the tail t is predicted given the head h and relation r.
Datasets. We evaluate our framework through experiments on three public datasets: FB15k-237, WN18RR (Dettmers et al., 2018), and a disease database, DDB14 (Wang et al., 2021). To build federated datasets, we randomly split the triplets across clients without replacement. Note that the random split makes the data heterogeneous among clients and ensures a fair comparison between FedE and FEDR.

Effectiveness Analysis
The commonly-used metric for link prediction, mean reciprocal rank (MRR), is used to evaluate FEDR's performance. We take FedE and Local, where embeddings are trained only on each client's local KG, as the baselines. Table 2 shows the link prediction results under different numbers of clients C. We observe that FEDR comprehensively surpasses Local under all settings, which indicates that relation aggregation helps learn better embeddings in FL. Taking NoGE as an example, FEDR gains 29.64±0.037%, 22.13±0.065%, and 11.84±0.051% average improvements in MRR on the three datasets. Compared with FedE, FEDR usually presents better or similar results with the KGE models DistMult and its extension ComplEx on all datasets. We also observe that both entity and relation aggregation beat the Local setting but gain only marginal improvements with DistMult and ComplEx on DDB14 and WN18RR. In particular, federated KGE models fail to obtain reasonable results with ComplEx. A potential reason is that averaging aggregation is not suitable for complex-valued embedding spaces, especially on extremely unbalanced data (w.r.t. the number of unique entities and relations in a KG). Although FedE performs better than FEDR with TransE and RotatE, the absolute performance reductions between FedE and FEDR are mostly (13/16 = 81%) within 0.03 MRR on both DDB14 and FB15k-237, which shows that FEDR remains effective. Theoretical explanations of these results w.r.t. data heterogeneity and the characteristics of FL and KGE models need further study.
To further assess the relation aggregation strategy, we compare the performance of different KGE models in terms of Hit Rates, as shown in Figure 2. Similar to MRR, Hit Rates drop as the number of clients increases because the knowledge distribution becomes sparser. All KGE models behave well and consistently on DDB14, while there are large performance deviations between models on WN18RR and FB15k-237. We attribute this phenomenon to the biased local knowledge distribution, which is implicitly reflected by the number of local entities.

Privacy Leakage Analysis
Compared with entity aggregation, additional knowledge is required to perform the reconstruction attack in FEDR because it is almost impossible to infer any entity or triplet from relation embeddings alone. Therefore, we assume the server can access all entity embeddings, without entity IDs, from the clients. For simplicity, we let the server hold all information from C1, as in the attack in Section 3.1 (LR=100%). The difference in adversary knowledge between FedE and FEDR is outlined in Table 3.
Table 3: Summary of adversary knowledge. "G" represents "Global", "L" represents "Local". "EE" and "RE" represent entity and relation embeddings, respectively.
Table 4 presents the privacy leakage quantization in FEDR over three clients. The results show that relation aggregation protects both entity-level and graph-level privacy well, even when the attacker is additionally provided with local entity embeddings and no encryption techniques are used. In addition, we observe that although relation embeddings can be exploited directly in FEDR rather than inferred, the privacy leakage rates in FEDR are still substantially lower than those in FedE.

Communication Efficiency Analysis
In this section, the product of data size and communication rounds is used to measure the communication cost. Considering the performance difference between FEDR and FedE, for a fair comparison of communication efficiency, we count the rounds needed for a model to reach a pre-defined MRR target on the validation dataset. Specifically, we set two MRR targets: 0.2 and 0.4. Since all models perform well on DDB14, we take the setting with C = 5 on DDB14 as an example in this section. The required rounds for each model are depicted in Figure 3. We observe that FEDR reaches the target with far fewer rounds than FedE. For instance, FEDR-DistMult reaches the target MRR = 0.4 within 10 rounds while FedE needs 45 rounds. Also, according to the statistics of the federated datasets in Table 5, the average numbers of unique entities in FedE and unique relations in FEDR are 4462.2 and 12.8, respectively. Using the number of entities/relations to reflect data size, relation aggregation reduces the cost by 99.89±0.029% on average across all clients when the target MRR is 0.2, and by 99.90±0.042% when the target MRR is 0.4. These results demonstrate that our proposed framework is more communication-efficient.

Convergence Analysis
The convergence curves for four KGE models and three datasets are shown in Figure 4.

Conclusion and Future Work
In this paper, we conduct the first empirical quantization of privacy leakage in federated learning on knowledge graphs, revealing that the recent work FedE is susceptible to a reconstruction attack based on shared element-embedding pairs when the server and some clients are dishonest. We then propose FEDR, a privacy-preserving FL framework on KGs with relation embedding aggregation that effectively defends against the reconstruction attack. Experimental results show that FEDR outperforms FedE w.r.t. data privacy and communication efficiency while maintaining similar utility.
In real-world applications, different organizations may use different KGE models, which may affect the overall performance of embedding aggregation. How to design an effective FL framework in this case, and how to perform KG reconstruction attacks and defenses against them, are our future research directions.

A Knowledge Graph Reconstruction
We summarize the knowledge graph reconstruction attack in Algorithm 2. Note that in the algorithm, i) and ii) refer to different operations, and only one of them is performed in FedE or FEDR.
Algorithm 2: Knowledge graph reconstruction attack in FedE/FEDR. Adversary knowledge: local entity embeddings (LEE), local relation embeddings (LRE), element-embedding pairs from a client (EEP), and the type of the KGE model used.

B Experiment Details

If not specified, the local update epoch is 3 and the embedding dimension of entities and relations is 128. Early stopping is utilized in the experiments: the patience, namely the number of epochs with no improvement in MRR on validation data after which training is stopped, is set to 5. We use Adam with learning rate 0.001 for local model updates. All models are trained on one Nvidia 2080 GPU with at most 300 communication rounds.

B.1 Statistics of Datasets
To build federated datasets, we randomly split the triplets across clients without replacement, then divide each client's local triplets into train, valid, and test sets with a ratio of 80/10/10. The statistics of the datasets after the split are described in Table 5.
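This procedure can be sketched as follows (an illustrative implementation; the exact preprocessing script may differ):

```python
import random

def federated_split(triples, num_clients, seed=0):
    """Randomly partition triples across clients without replacement, then split
    each client's shard into train/valid/test with an 80/10/10 ratio."""
    rng = random.Random(seed)
    shuffled = list(triples)
    rng.shuffle(shuffled)
    # Round-robin over the shuffled list: disjoint, near-equal shards.
    shards = [shuffled[c::num_clients] for c in range(num_clients)]
    splits = []
    for shard in shards:
        n_train, n_valid = int(0.8 * len(shard)), int(0.1 * len(shard))
        splits.append({"train": shard[:n_train],
                       "valid": shard[n_train:n_train + n_valid],
                       "test":  shard[n_train + n_valid:]})
    return splits

triples = [(f"h{i}", "r", f"t{i}") for i in range(100)]
splits = federated_split(triples, num_clients=5)
# each of the 5 clients gets 20 triples: 16 train / 2 valid / 2 test
```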

B.2 Client Update
The client update, i.e., the local knowledge graph embedding update, corresponds to Update(c, E) in Algorithm 1 starting from line 9, which learns the embeddings of both entities and relations.
For a triplet (h, r, t) in client c, we adopt self-adversarial negative sampling (Sun et al., 2019) to effectively optimize the non-GNN KGE models:

L = -log σ(γ + f(h, r, t)) - Σ_{i=1}^n p(h, r, t'_i) log σ(-f(h, r, t'_i) - γ),

where γ is a predefined margin, σ is the sigmoid function, f is the scoring function that varies as shown in Table 6, and (h, r, t'_i) is the i-th negative triplet, sampled from the distribution

p(h, r, t'_j) = exp(α f(h, r, t'_j)) / Σ_i exp(α f(h, r, t'_i)),

where α is the temperature of sampling. There are E epochs of training on the client in each round to update the local-view embeddings E, including both entity and relation embeddings, but only the local relation embeddings {E^{r,c}} are sent to the server. For NoGE, we follow its original design and minimize the binary cross-entropy loss

L = -Σ_{(h,r,t) ∈ G ∪ G'} ( y log σ(f(h, r, t)) + (1 - y) log(1 - σ(f(h, r, t))) ),

where G and G' are the collections of valid and invalid triplets, respectively, and y = 1 if (h, r, t) ∈ G and 0 otherwise.
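The self-adversarial loss can be sketched as follows (assuming f returns a plausibility score where higher is better, following the RotatE-style formulation; the margin and temperature values are illustrative):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def self_adversarial_loss(pos_score, neg_scores, gamma=12.0, alpha=1.0):
    """Self-adversarial negative sampling loss (Sun et al., 2019):

        L = -log σ(γ + f(h,r,t)) - Σ_i p_i · log σ(-f(h,r,t'_i) - γ),

    where p_i is a softmax over the negative scores with temperature alpha
    (treated as constant weights, i.e. no gradient flows through them)."""
    neg_scores = np.asarray(neg_scores, dtype=float)
    w = np.exp(alpha * neg_scores)
    p = w / w.sum()                      # self-adversarial weights p_i
    pos_term = -np.log(sigmoid(gamma + pos_score))
    neg_term = -np.sum(p * np.log(sigmoid(-neg_scores - gamma)))
    return float(pos_term + neg_term)

neg = np.array([0.5, -1.0, 2.0])  # scores of sampled negative triplets
# A well-scored positive triplet yields a smaller loss than a poorly-scored one.
```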

B.3 Scoring Function
Table 6: Scoring function f(h, r, t) of each KGE model.

C Secure Aggregation in FEDR
In this section, we illustrate how SecAgg works in FEDR through a simple example with three clients and two relations. Mathematically, we assume the distributions of relation embeddings are R_1 = {r_1}, R_2 = {r_2}, and R_3 = {r_1}, respectively. After PSU, the server obtains the relation set R = {r_1, r_2}. Besides, we denote the corresponding masking vectors as M_1 = (1, 0), M_2 = (0, 1), and M_3 = (1, 0). In one communication round, once all clients complete local training and prepare for the aggregation phase, each client u generates, via Diffie-Hellman secret sharing (Bonawitz et al., 2017), a random value s_{u,v} for every other client v, and all clients agree on a large prime number l. Then each party u computes the masked value t_u for its secret vector s_u, where s_u := {R_u, M_u}:

t_u = s_u + Σ_{v>u} s_{u,v} - Σ_{v<u} s_{v,u} (mod l),

where s_{u,v} = s_{v,u} for each pair, e.g., s_{1,2} = s_{2,1}. Therefore, each client holds its masked value as follows: t_1 = s_1 + s_{1,2} + s_{1,3} (mod l), t_2 = s_2 + s_{2,3} - s_{2,1} (mod l), t_3 = s_3 - s_{3,1} - s_{3,2} (mod l). Next, these masked values are uploaded to the server. The server cannot recover the actual information of any individual client, but it can extract the correct aggregated value via

Σ_{u=1}^3 t_u = s_1 + s_2 + s_3 (mod l),

since every pairwise mask is added exactly once and subtracted exactly once.
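The mask cancellation can be demonstrated numerically with a toy sketch (pairwise masks are sampled directly here for clarity; a real deployment derives s_{u,v} via Diffie-Hellman key agreement and also handles client dropouts):

```python
import numpy as np

MODULUS = 2**31 - 1  # the agreed large prime l

def pairwise_masks(num_clients, length, rng):
    """One shared random mask s_{u,v} = s_{v,u} per client pair."""
    return {(u, v): rng.integers(0, MODULUS, size=length)
            for u in range(num_clients) for v in range(u + 1, num_clients)}

def mask_secret(u, secret, masks, num_clients):
    """t_u = s_u + Σ_{v>u} s_{u,v} - Σ_{v<u} s_{v,u}  (mod l)."""
    t = secret.astype(np.int64)
    for v in range(num_clients):
        if v > u:
            t = t + masks[(u, v)]
        elif v < u:
            t = t - masks[(v, u)]
    return t % MODULUS

rng = np.random.default_rng(42)
secrets = [np.array([1, 2]), np.array([3, 4]), np.array([5, 6])]
masks = pairwise_masks(3, 2, rng)
masked = [mask_secret(u, secrets[u], masks, 3) for u in range(3)]
# Each t_u looks random, but the pairwise masks cancel in the sum:
aggregate = sum(masked) % MODULUS  # == (s_1 + s_2 + s_3) mod l == [9, 12]
```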

D Additional Results
In this section, we introduce additional experimental results of KB-GAT in a federated manner for link prediction.

D.1 Experiment result with KB-GAT
Since the aggregated information is not exploited in the local training of NoGE, we also implement KB-GAT (Nathani et al., 2019), another GNN model, which can take advantage of both graph structure learning and global-view information aggregation. However, federated KB-GAT is memory-consuming. For KB-GAT, we use GAT (Veličković et al., 2018) as the encoder and ConvKB (Nguyen et al., 2018) as the decoder. Although the input to KB-GAT is the triplet embedding, the model updates neural network weights to obtain the final entity and relation embeddings. In each communication round, we let the aggregated embeddings be the new input to KB-GAT. We find that small local epoch counts lead to poor performance because the model is not trained long enough to produce high-quality embeddings. Therefore, we set the local epochs of the GAT layers to 500 and the local epochs of the convolutional layers to 150. The embedding size is 50 instead of 128 as in the other models, since we encountered memory problems with this model.
We run KB-GAT with both entity aggregation and relation aggregation on DDB14 with C = 3, as shown in Table 7. Due to the good performance of RotatE, we also compare KB-GAT with RotatE. Hit@N is also used in the evaluation. From the table, KB-GAT beats RotatE on all evaluation metrics in both the FedE and FEDR settings. However, how to implement federated KB-GAT in a memory-efficient way remains an open problem.