Temporal Extrapolation and Knowledge Transfer for Lifelong Temporal Knowledge Graph Reasoning

Real-world Temporal Knowledge Graphs (TKGs) keep growing over time: new entities and facts emerge continually, necessitating a model that can extrapolate to future timestamps and transfer knowledge to new components. Our work is the first to dive into this more realistic issue, lifelong TKG reasoning, where existing methods can address only part of the challenges. Specifically, we formulate lifelong TKG reasoning as a temporal-path-based reinforcement learning (RL) framework. We then add temporal displacement into the action space of RL to extrapolate to the future and further propose temporal-rule-based reward shaping to guide training. To transfer and update knowledge, we design a new edge-aware message passing module in which the embeddings of new entities and edges are inductive. We conduct extensive experiments on three newly constructed benchmarks for lifelong TKG reasoning. Experimental results show that our model outperforms all carefully adapted baselines.


Introduction
Knowledge Graphs (KGs) store structured facts about human knowledge or the objective world, formalizing facts as entities e (nodes) and relations r (links) between them. Static Knowledge Graphs (SKGs) and Temporal Knowledge Graphs (TKGs) are two typical forms of KGs. SKGs store facts as triples (e_s, r, e_o), and TKGs extend triples to quadruples (e_s, r, e_o, t), where t indicates the happening time. Since real-world events are usually ever-changing and associated with time, TKGs are naturally confronted with continually emerging entities and facts at future timestamps throughout their whole lifecycle (Chen et al., 2023a). Therefore, this paper investigates the TKG link prediction task over incomplete TKGs in the lifelong setting, named lifelong TKG reasoning. Figure 1 gives an example of temporally growing TKGs from the ICEWS14 dataset.
However, SKG reasoning methods (Trouillon et al., 2016) lack consideration of temporal change; conventional transductive TKG reasoning methods (Lacroix et al., 2020) need re-training because of their closed-world assumption; and the latest inductive TKG reasoning methods (Chen et al., 2023b) treat emerging entities as simultaneous, oversimplifying the real scenario and thereby casting doubt on their genuine performance. Hence, our proposed lifelong TKG reasoning problem is more challenging and realistic.
We formulate lifelong TKG reasoning as a temporal-path-based RL framework and design the whole pipeline for extrapolating, transferring and updating. In the following, we introduce our targeted solutions and expound on their advantages over existing methods.
First, we focus on the temporal displacement between the timestamps of a candidate edge and its preceding edge, and add temporal displacement into the RL action space. TKGE models (Xu et al., 2021) rely on embeddings of absolute timestamps and fit only past timestamps; they obviously do not meet the requirements of lifelong learning. On the contrary, the transferable temporal displacement in our RL framework can be extrapolated from known timestamps to arbitrary future timestamps based on the magnitude of the displacement. In addition, we design a reward-shaping module based on temporal rules found by RL, which have only temporal-order constraints on relations. This module frees the reasoning from particular entities and remains applicable at future timestamps.
Secondly, lifelong TKG reasoning can be considered as multiple consecutive inductive TKG reasoning processes. Recent inductive TKG reasoning methods (Park et al., 2022; Xu et al., 2023) can deal only with future time, not new components, let alone learn continuously as required in lifelong TKG reasoning. Therefore, we design a new edge-aware message passing module, which not only transfers learned relation types to initialize emerging entities, but also updates the embeddings of all entities and edges in new TKGs. We also use the embeddings of specific edges rather than immobile relation types, since we seek to explore the concrete environment of each fact to counteract the influence of rapid TKG growth.
We build three new benchmarks based on three popular datasets to simulate the lifelong TKG reasoning scenario. In the experiments, we carefully adapt existing baselines by empowering them with temporal extrapolation or knowledge transfer capabilities. In summary, our main contributions are:
• To our knowledge, we are the first to pose and explore the more challenging and realistic lifelong TKG reasoning issue, which simulates TKGs growing in timestamps, entities and facts, and we formulate it as an RL-based framework.
• To solve the challenges of temporal extrapolation and knowledge transfer in lifelong TKG reasoning, we propose the targeted solutions: temporal displacement, temporal-rule-based reward shaping and an edge-aware message passing module.
• We build three new benchmarks to evaluate our model. Experiments on temporal link prediction show that our model not only achieves the best average performance but also improves progressively on growing TKG snapshots.
Related Work

Inductive SKG Reasoning
Traditional SKG reasoning models, such as SKG Embedding (SKGE) methods, focus on the transductive setting, where they are trained and tested on a fixed set of components. Recently, inductive SKG reasoning has drawn much attention. GraIL (Teru et al., 2020), SE-GNN (Li et al., 2021a) and MaKEr (Chen et al., 2022) are all GNN-based inductive reasoning models, approaching the problem from the viewpoints of subgraphs, data relevance and meta-learning, respectively. PathCon (Wang et al., 2021) leverages relational message passing for relation prediction, whereas we aim at the harder entity prediction task. Moreover, MINERVA (Das et al., 2018) first introduces RL to search for the tail entity of each query end-to-end. Multi-Hop (Lin et al., 2018) advances MINERVA with reward shaping based on SKGE methods.

Inductive TKG Reasoning
Inductive TKG reasoning models mainly deal with seen entities at future times. xERTE (Han et al., 2021) is delicately designed to forecast future links by iterative sampling of temporal neighbours. TGAP (Jung et al., 2021) introduces temporally relevant events into a GNN for better explainability.
For RL-based TKG reasoning, CluSTeR (Li et al., 2021b) regards RL as a clue-searching stage, but it strips the temporal dimension away from RL and then rearranges the clues in chronological order in a subsequent temporal reasoning stage. However, these methods cannot handle unseen entities emerging over time.
TITer (Sun et al., 2021) further defines a relative time encoding to distinguish the same entity at different timestamps and leverages the query information to represent unseen entities. Different from the above models, we introduce the temporal displacement of facts into RL and propose relation-type-based knowledge transfer for emerging entities.

Lifelong KG Reasoning
Recently, how to retain and reuse previous knowledge in a new environment has become a research highlight (Wang et al., 2019). MBE (Cui et al., 2022) explores inductive SKG reasoning under the multi-batch emergence scenario, which is similar to lifelong KG reasoning without considering time. LKGE (Cui et al., 2023) first formally studies lifelong SKG reasoning by transferring knowledge, using TransE (Bordes et al., 2013) as the base model. However, neither pays attention to the crucial temporal factor in TKGs. To this end, we raise lifelong TKG reasoning, which involves both unseen components and future timestamps, making the issue challenging and realistic.

Preliminaries
Growing TKGs in lifelong TKG reasoning can be viewed as a sequence of ρ snapshots: G = (G_1, G_2, ..., G_ρ). Each snapshot G_i contains its entity, relation, timestamp and fact sets E_i, R_i, T_i, D_i, where facts are quadruples over a continuous time period. We use E_{Δi+1} = E_{i+1} − E_i and D_{Δi+1} = D_{i+1} − D_i to denote the emerging entities and facts. The TKG link prediction task asks to predict the missing entities of a query edge in incomplete TKGs.
For lifelong TKG reasoning, we leverage the TKG link prediction task above to train a new model M_{i+1} by transferring and updating knowledge in G_{i+1}. After finishing training on F_i and validation on V_i, the model M_i is evaluated on the accumulated test sets ∪_{j=1}^{i} Q_j for an overall assessment.
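The lifelong protocol above (train on each snapshot in turn, evaluate on the union of all test sets so far) can be sketched as follows. The snapshot structure and the `train`/`evaluate` callables are illustrative placeholders, not the paper's actual implementation.

```python
def lifelong_protocol(snapshots, train, evaluate):
    """snapshots: list of dicts with 'train', 'valid', 'test' fact lists;
    train/evaluate are caller-supplied callables (placeholders here)."""
    scores = []
    accumulated_tests = []
    model = None
    for snap in snapshots:
        # transfer from the previous model instead of re-training from scratch
        model = train(model, snap["train"], snap["valid"])
        accumulated_tests.extend(snap["test"])
        # evaluate M_i on the accumulated test sets Q_1 ... Q_i
        scores.append(evaluate(model, accumulated_tests))
    return scores
```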

Model Overview
Figure 2 shows the architecture of our model. Along a sequence of growing TKGs, our model transfers and updates knowledge iteratively from the previous TKG snapshot to the next one, avoiding re-training. Inside each G_i, we regard reasoning as a walk-based action-selection process. An agent starts from the query entity, repeatedly takes actions through temporal edges based on temporal displacement, and is expected to reach the target entity within a limited number of steps (Section 3.3). To achieve knowledge transfer and update, we inject embeddings of relation types into emerging entities e_new, and then update all embeddings of entities e and edges g in our proposed edge-aware message passing module (Section 3.4). Section 3.5 describes our temporal-rule-based reward shaping.

Reinforcement Learning Framework
For each edge, we add its reversed edge to G_i, making the reasoning traceable and controllable. For each entity e, we also add self-loop temporal edges at every timestamp to G_i, which allow the agent to stay in place; they serve as stop actions.
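The graph augmentation just described can be sketched as below, assuming facts are (subject, relation, object, t) quadruples; the names `augment`, `"inv"` and `"self_loop"` are illustrative, not the paper's actual identifiers.

```python
def augment(quads, entities, timestamps):
    """Add a reversed edge for every fact and a self-loop per entity
    at every timestamp (the self-loops act as stop actions)."""
    augmented = list(quads)
    # reversed edge for every fact, with a marked inverse relation
    augmented += [(o, ("inv", r), s, t) for (s, r, o, t) in quads]
    # self-loop at every timestamp lets the agent stay put
    augmented += [(e, "self_loop", e, t) for e in entities for t in timestamps]
    return augmented
```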

Environment Setting
Our environment can be formulated as a Markov Decision Process (MDP) over TKGs with the following components; we take G_i as an example.
States. Let S_i denote all possible states of G_i and (e_q, r_q, ?, t_q) denote the query. At step m ∈ [0, M], the agent is located at entity e_m and timestamp t_m, so the state is s_m = (e_m, t_m, e_q, r_q, t_q) ∈ S_i. Specifically, the initial state is s_0 = (e_q, t_q, e_q, r_q, t_q).
Time-constrained Actions. Let A_i denote the action space of G_i and A_i^m the set of optional actions at s_m in G_i. Compared with SKGs, the time dimension makes the RL action space in TKGs extremely large. Hence, we add two temporal constraints to prune the action space, since facts closer to t_m in the current state s_m are more likely to be directly relevant to the prediction:

A_i^m = { (e′, g′, δt′) | g′ = (e_m, r′, e′, t′) ∈ G_i, t′ ≤ t_m, t_m − t′ ≤ T },   (1)

where g′ is a candidate edge, δt′ = t_m − t′ is the temporal displacement, and T is a hyperparameter. A_i^m naturally includes reversed and self-loop temporal edges.
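The two pruning constraints (no edge later than the current time t_m, and displacement within the window T) can be sketched as a filter over outgoing edges. The adjacency-list layout and function name here are assumptions for illustration.

```python
def candidate_actions(edges_from, e_m, t_m, T):
    """edges_from: dict entity -> list of (relation, target, t') outgoing
    edges (reversed and self-loop edges assumed already included).
    Returns actions as (next_entity, candidate_edge, displacement)."""
    actions = []
    for (r, e_next, t_prime) in edges_from.get(e_m, []):
        # constraint 1: the edge must not happen after the current time
        # constraint 2: its displacement must not exceed the window T
        if t_prime <= t_m and t_m - t_prime <= T:
            actions.append((e_next, (e_m, r, e_next, t_prime), t_m - t_prime))
    return actions
```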
Transitions. The transition function ω : S_i × A_i → S_i is deterministic in G_i and updates the state according to the selected action.
Rewards. In the default formulation, the agent receives a binary reward R_b(s_M) = I(e_M = e_ans), where s_M = (e_M, t_M, e_q, r_q, t_q) is the final state, e_ans is the answer to the query and I is the binary indicator function. Our reward R(s_M) shaped by temporal rules is described in Section 3.5.

Policy Network
First, the temporal displacement between the timestamp of the current state, t_{m−1} (with t_0 = t_q), and that of its subsequent action, t_m, integrally captures the time-related dynamics. The temporal displacement is denoted as δt_m = t_{m−1} − t_m ≤ T. Secondly, because TKGs evolve temporally, even if two edges in G_i and G_{i+1} share the same relation type, their semantics can change considerably due to different surrounding environments; only by taking surrounding edges into account can we understand their contextual semantics. Moreover, the above operations are in line with the foundation of RL: constant interaction with the environment.
Therefore, the input of the policy network has three parts: u_{e_m}, u_{g_m}, τ_{δt_m} ∈ R^d, i.e., the embeddings of entity e_m, edge g_m = (e_{m−1}, r_m, e_m), and temporal displacement δt_m (u_{e_m} and u_{g_m} are obtained from the edge-aware message passing module in Section 3.4; τ_{δt_m} is transferred from G_{i−1} and updated in G_i). The path history h_m in G_i is encoded as:

h_m = LSTM(h_{m−1}, [u_{g_m} ; τ_{δt_m}]),   h_0 = LSTM(0, [u_{r_0} ; τ_{δt_q}]),   (2)

where h_m ∈ R^{2d}, u_{r_0} ∈ R^d is the embedding of the special starting relation r_0, and δt_q = 0. For a candidate next action a′ = (e′, g′, δt′) ∈ A_i^m (δt′ = t_m − t′) in Eq. 1, we calculate the probability of its state transition based on the correlation of the action and the query in terms of entities and edges:

φ(a′ | s_m) = ⟨W_e u_{e′}, W_q h_m⟩ + ⟨W_g u_{g′}, W_λ h_m⟩,   (3)

where W_e, W_g, W_q, W_λ are learnable matrices and ⟨·, ·⟩ is the vector dot product. After scoring all actions, the policy π_θ(a_{m+1} | s_m) is obtained through a softmax.

Training and Optimization
We fix the search path length to M. In lifelong TKG reasoning, the policy network is trained by maximizing the expected reward over the growing TKG snapshots G_1, G_2, ..., G_ρ. Hence, our model is trained over F_1, F_2, ..., F_ρ in turn:

J(θ) = E_{(e_q, r_q, e_ans, t_q) ∈ F_i} E_{a_1, ..., a_M ∼ π_θ} [ R(s_M | e_q, r_q, t_q) ],   (4)

where i ∈ [1, ρ]. Then, we use the REINFORCE algorithm to iteratively optimize our model:

∇_θ J(θ) ≈ ∇_θ Σ_{m=1}^{M} R(s_M | e_q, r_q, t_q) log π_θ(a_m | s_{m−1}).   (5)
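The REINFORCE update used here can be illustrated with a minimal sketch: for a rolled-out trajectory, the gradient of the expected reward is estimated as the terminal reward times the gradient of the log-probability of each chosen action. A tabular softmax policy stands in for the paper's neural policy network; all names are illustrative.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

def reinforce_step(theta, trajectory, reward, lr=0.1):
    """theta: dict state -> list of action logits;
    trajectory: list of (state, chosen_action_index) pairs."""
    for state, a in trajectory:
        probs = softmax(theta[state])
        for j in range(len(theta[state])):
            # d/d logit_j of log softmax_a = 1{j == a} - p_j
            grad = (1.0 if j == a else 0.0) - probs[j]
            theta[state][j] += lr * reward * grad
    return theta
```

One such step raises the logit of the rewarded action relative to the alternatives, which is exactly the behaviour Eq. 5 induces on the policy parameters.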

Embedding Transfer and Update
SKGs with static entity properties can be seen as long-term valid knowledge that helps generate accurate evolutional embeddings of entities (Li et al., 2021c; Niu and Li, 2023). Therefore, for each TKG snapshot G_i, the timestamps are masked to convert G_i into its corresponding SKG snapshot G_i^s.
We adopt a relation-type-based transferring layer over G_i^s, since the relation types of connected edges provide valuable clues about their nature. The transferring layer injects learned knowledge about relation types into new entities. Formally, for a new entity e in G_i, we generate its beginning embedding u_i^b(e) ∈ R^d:

u_i^b(e) = (1 / |N(e)|) ( Σ_{(e′, r, e) ∈ G_i^s} W_in u_r + Σ_{(e, r, e′) ∈ G_i^s} W_out u_r ),

where N(e) is the set of edges connected to e, and W_in, W_out ∈ R^{d×d} are two learnable weight matrices. u_r ∈ R^d, serving as the embedding of relation type r, is learnable throughout the whole lifelong TKG reasoning process. Furthermore, to avoid recalculation for seen entities, we only generate embeddings for emerging entities and inherit the embeddings of seen ones from the preceding TKG snapshot G_{i−1}.
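A minimal sketch of this initialization, assuming the aggregation averages transformed relation embeddings over the new entity's incoming and outgoing edges; the learnable matrices W_in/W_out are reduced to per-direction scalars here for brevity, and the function name is hypothetical.

```python
def init_new_entity(entity, edges, rel_emb, w_in=1.0, w_out=1.0):
    """edges: list of (s, r, o) triples in the static snapshot;
    rel_emb: dict relation -> list[float] learned relation embeddings."""
    msgs = []
    for (s, r, o) in edges:
        if o == entity:    # incoming edge: transfer via the 'in' weight
            msgs.append([w_in * x for x in rel_emb[r]])
        elif s == entity:  # outgoing edge: transfer via the 'out' weight
            msgs.append([w_out * x for x in rel_emb[r]])
    n = max(len(msgs), 1)
    dim = len(next(iter(rel_emb.values())))
    return [sum(m[k] for m in msgs) / n for k in range(dim)]
```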
To update the embeddings of all components in G_i^s, we propose a new edge-aware message passing module with bidirectional communication between edges and nodes. This module enables our model to better adapt to the rapidly changing environment in lifelong TKG learning. For each edge g in G_i^s, the links connected to its two endpoints act as a relevant semantic environment.
Therefore, we alternately pass edge-aware messages between nodes and edges to aggregate unique environment knowledge for each edge. Let u_i^ℓ(e) and u_i^ℓ(g) be the embeddings of entity e and edge g at the ℓ-th layer:

u_i^{ℓ+1}(e) = σ( W_self^ℓ u_i^ℓ(e) + Σ_{(e′, g′) ∈ N_in(e)} W_in^ℓ φ(u_i^ℓ(e′), u_i^ℓ(g′)) + Σ_{(e′, g′) ∈ N_out(e)} W_out^ℓ φ(u_i^ℓ(e′), u_i^ℓ(g′)) ),

u_i^{ℓ+1}(g) = σ( W_g^ℓ [ u_i^{ℓ+1}(e_left) ; u_i^{ℓ+1}(e_right) ; u_i^ℓ(g) ] + b_g^ℓ ),

where e_left and e_right are the two endpoints of edge g; W_self^ℓ, W_in^ℓ, W_out^ℓ ∈ R^{d×d}, W_g^ℓ ∈ R^{d×3d} and b_g^ℓ ∈ R^d are learnable weight matrices. The message transition function φ(u_i^ℓ(e′), u_i^ℓ(g′)) = u_i^ℓ(e′) • u_i^ℓ(g′) stores environment knowledge by calculating the correlation between connected entities and edges, where • is the Hadamard product.
After L layers of updating, the final representations of entity e and edge g are u_i^L(e) and u_i^L(g). When there is no ambiguity, we abbreviate them in RL (Section 3.3) as u_e and u_g, respectively.
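The bidirectional node-edge flow of one message passing round can be sketched as below. Weight matrices and the nonlinearity are omitted, the edge update is reduced to a plain average, and the message φ(u(e′), u(g′)) = u(e′) • u(g′) is the elementwise product described above; this is a simplified illustration, not the paper's exact module.

```python
def message_round(node_emb, edge_emb, edges):
    """One round: nodes aggregate phi-messages from connected edges,
    then each edge is refreshed from its two (updated) endpoints and
    its own previous state. edges: list of (left_entity, edge_id, right_entity)."""
    dim = len(next(iter(node_emb.values())))
    new_nodes = {e: list(v) for e, v in node_emb.items()}  # self term
    for (el, g, er) in edges:
        for e_self, e_other in ((el, er), (er, el)):
            # phi: elementwise product of neighbour node and edge embeddings
            msg = [node_emb[e_other][k] * edge_emb[g][k] for k in range(dim)]
            new_nodes[e_self] = [a + b for a, b in zip(new_nodes[e_self], msg)]
    new_edges = {}
    for (el, g, er) in edges:
        new_edges[g] = [(new_nodes[el][k] + new_nodes[er][k] + edge_emb[g][k]) / 3
                        for k in range(dim)]
    return new_nodes, new_edges
```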

Temporal-Rule-Based Reward Shaping
For a query (e_q, r_q, ?, t_q) with answer e_ans, RL yields a reasoning trajectory ((e_q, r_1, e_1, t_1), (e_1, r_2, e_2, t_2), ..., (e_{M−1}, r_M, e_M, t_M)), where t_q ≥ t_1 ≥ t_2 ≥ ... ≥ t_M. From it we can extract a temporal rule R: (r_M, ..., r_2, r_1) ⇒ r_q with non-descending temporal constraints, and denote the confidence of R as conf(R). Since, per Section 3.3.1, the agent receives a binary reward based only on whether e_M equals e_ans, regardless of the quality of the reasoning temporal paths, we introduce a temporal-rule-based reward shaping to guide the training of the agent:

R(s_M) = R_b(s_M) + (1 − R_b(s_M)) conf(R),

where conf(R) is obtained by dividing the rule support by the body support.
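The confidence computation (rule support divided by body support) can be sketched as follows, under the assumption that trajectories are summarised as (relation sequence, whether the head fact holds) pairs; the function name and data layout are illustrative.

```python
def rule_confidence(rule_body, trajectories):
    """trajectories: list of (relation_sequence, head_holds) pairs, where
    head_holds records whether (e_q, r_q, e_M) is a known fact.
    Returns rule support / body support."""
    matches = [head for (body, head) in trajectories if body == rule_body]
    if not matches:  # unseen body: no evidence, zero confidence
        return 0.0
    return sum(matches) / len(matches)
```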

Constructed Benchmarks
To evaluate lifelong TKG reasoning, we construct three new TKG benchmarks based on the datasets ICEWS14, ICEWS05-15 (García-Durán et al., 2018) and ICEWS18 (Jin et al., 2020) to simulate the growth of their entities, named ICEWS14-lifelong, ICEWS05-15-lifelong and ICEWS18-lifelong. Table 1 shows the statistics of the new benchmarks.
1. Counting. A more uniform entity distribution makes the changes in G more concentrated and significant, so we filter out entities that occur fewer than 10 times and count the remaining entities, relations and timestamps as |E|, |R|, |T|.
2. Lifelong Simulating. New entities emerge at almost all timestamps, so we accumulate entities in chronological order. First, we define the set of entities at t_0 = 0 as the initial E_i (i = 1, ..., 5). Secondly, we iteratively add the set of entities at the next timestamp to expand E_i. Once |E_i| ≥ ((4+i)/10)|E|, we stop expanding E_i, record the current timestamp t_i, and then start searching for t_{i+1} in the same way. Thirdly, after obtaining the five timestamps t_1 ~ t_5, we denote the union of TKGs from t_{i−1} to t_i as TKG snapshot G_i. Since relations are dense, the number of relations |R_i| in every G_i equals |R|. For the last TKG snapshot G_6, we set t_6 = |T| to ensure that all facts in D are covered.
3. Dividing. For each TKG snapshot G_i, we randomly divide G_i into a training set F_i, validation set V_i and test set Q_i with ratio 3:1:1.
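The snapshot-boundary search in step 2 can be sketched as follows: accumulate entities timestamp by timestamp and record a boundary t_i once the running entity set reaches the (4+i)/10 fraction of all entities, for i = 1..5, with t_6 = |T|. The data layout and function name are illustrative.

```python
def snapshot_boundaries(entities_at, num_timestamps, total_entities):
    """entities_at: dict t -> set of entities appearing at timestamp t.
    Returns the six boundary timestamps [t_1, ..., t_6]."""
    seen = set()
    boundaries = []
    i = 1
    for t in range(num_timestamps):
        seen |= entities_at.get(t, set())
        # a single timestamp may cross several thresholds at once
        while i <= 5 and len(seen) >= (4 + i) / 10 * total_entities:
            boundaries.append(t)
            i += 1
    boundaries.append(num_timestamps)  # t_6 = |T| covers the remaining facts
    return boundaries
```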
Baselines. We compare our model against the SOTA lifelong SKG reasoning model LKGE (Cui et al., 2023), which uses TransE as its knowledge transfer module and can be well adapted to lifelong TKG reasoning. MBE (Cui et al., 2022), designed for the multi-batch emergence scenario, is also a powerful baseline, since it leverages walk-based reasoning and has an inductive entity encoding module; we name MBE adapted for lifelong TKG reasoning L-MBE.
Based on the framework of LKGE, we build L-TeRo with the SOTA TKGE model TeRo (Xu et al., 2020) by defining the timestamp embeddings as a linear function. L-TITer, L-TGAP and L-TLogic are modified from TITer (Sun et al., 2021), TGAP (Jung et al., 2021) and TLogic (Liu et al., 2022). TITer is encapsulated according to the requirements of lifelong TKG reasoning, forming L-TITer. TGAP can handle future timestamps but is not inductive, so L-TGAP randomly initializes emerging entities. TLogic is based on entity-independent temporal logical rules, so L-TLogic serves lifelong TKG reasoning by transferring its discovered rules.
Evaluation Metrics. Following convention, we conduct experiments on the temporal link prediction task and report Mean Reciprocal Rank (MRR) and Hits@k (k = 3, 10). We then evaluate the knowledge lifelong learning capability using forward transfer (FWT) and backward transfer (BWT) (Lopez-Paz and Ranzato, 2017):
FWT = (1/(ρ−1)) Σ_{i=2}^{ρ} h_{i−1,i},   BWT = (1/(ρ−1)) Σ_{i=1}^{ρ−1} (h_{ρ,i} − h_{i,i}),

where ρ is the number of TKG snapshots and h_{i,j} is the MRR score on Q_j after training the model M_i on F_i.
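Assuming the standard transfer metrics from Lopez-Paz and Ranzato (2017), FWT and BWT can be computed from the score matrix h as in this sketch (0-based indexing: h[i][j] is the MRR of model M_{i+1} on Q_{j+1}).

```python
def fwt(h):
    """Forward transfer: mean score on snapshot i+1 right after training on i."""
    rho = len(h)
    return sum(h[i - 1][i] for i in range(1, rho)) / (rho - 1)

def bwt(h):
    """Backward transfer: mean change on old snapshots after the final model."""
    rho = len(h)
    return sum(h[rho - 1][i] - h[i][i] for i in range(rho - 1)) / (rho - 1)
```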

Results on Lifelong TKG Reasoning
We run experiments with 10 seeds for all models on our three new benchmarks and report average results over the six TKG snapshots. The MRR, Hits@3 and Hits@10 results are shown in Table 2. Although the TransE used in LKGE efficiently models static relationships between entities, it is not sensitive to temporal change; LKGE therefore does not deal with the temporal dimension in lifelong TKG reasoning and confuses the timestamps in growing TKGs. As a result, it achieves only comparable Hits@10 results. Similar to our model, the adapted L-MBE uses an RL framework, which benefits its reasoning process, and its static link augmentation has also been proven effective in TKG reasoning (Niu and Li, 2023). Even so, the results of L-MBE are still worse than ours, because our model constructs a more targeted RL environment for lifelong TKG reasoning and further takes temporal displacement into account for temporal extrapolation.
However, the frameworks of LKGE and L-TeRo are based on traditional KGE and TKGE models, limiting their performance in lifelong TKG reasoning. L-TITer, L-TGAP and L-TLogic are adapted from recent powerful TKG link prediction models, but their original performance degrades to some extent when they are used for lifelong TKG reasoning: the IM module in TITer is parameter-free, so its knowledge cannot be transferred and updated; TGAP works in the interpolation setting, so the embeddings of future timestamps cannot be well initialized; and TLogic relies on temporal rules, but the transferred old rules may conflict with new ones in future TKG snapshots. Therefore, these existing baselines cannot perform well in lifelong TKG reasoning. On the contrary, our model consistently outperforms all baselines across the three benchmarks; some of our results are even twice those of the baselines. This is because our model comprehensively considers all the requirements of lifelong TKG reasoning, and the three targeted solutions address temporal extrapolation, knowledge transfer and knowledge update.

Performance Evolution
To demonstrate the performance evolution of our model and three baselines during lifelong TKG reasoning, the MRR results on all TKG snapshots are reported in Figure 3. We find: (i) For the same G_i, as our model learns over growing TKGs, the performance of M_i ~ M_6 on G_i remains relatively steady. This suggests that our model continuously transfers embeddings to emerging components and avoids forgetting previous knowledge. In contrast, the three baselines all degrade rapidly and suffer from catastrophic forgetting.
(ii) For the same M_i, we observe its performance across different TKG snapshots. Since old knowledge is not always suitable for new facts, learned knowledge must be updated, otherwise the performance of M_i drops. The inductive embedding layer in L-MBE sometimes succeeds in updating (from G_2 to G_3), while LKGE and L-TGAP clearly fail, as their decreasing MRR shows. This demonstrates our model's strong ability to update knowledge for new TKGs. Empirically, performance on G_1 is the upper bound of our model, since G_1 injects the initial knowledge into our model, and knowledge transfer and update only start once G_1 grows into G_2.

Knowledge Lifelong Learning Capability
To accurately quantify the knowledge lifelong learning capability of all models in lifelong TKG reasoning, we report the FWT and BWT scores of the MRR results in Figure 4. Since we specifically design the whole reasoning pipeline for extrapolating and transferring, the FWT scores of our model are the best among all models. LKGE and L-MBE were originally designed for lifelong KG reasoning, so they work relatively well over SKGs. The FWT scores of L-TLogic are poor because its rule-based strategy is too restricted by symbolic relations to reason over future TKGs.
The BWT scores of the baselines are all negative because they overwrite relation-type embeddings when updating learned knowledge. In contrast, our proposed edge-aware message passing module focuses on updating the unique environment of each fact rather than individual relation types, which lets our model introduce richer information for more accurate predictions. The low BWT scores of L-TeRo show the inefficiency of TKGE in lifelong TKG reasoning. L-TITer and L-TGAP are well-adapted baselines that combine their original TKG reasoning ability with the demands of lifelong TKG reasoning, so their BWT scores are better than those of the other baselines.

Case Study
Reasoning Trajectories and Temporal Rules.
To demonstrate the reasoning ability of our model over temporal edges, we perform a case study in Table 3, which shows two positive reasoning trajectories and a negative one for the target query (Sudan, sign formal agreement, Ethiopia, 2014/12/25).
We further give the confidence scores of the corresponding temporal rules extracted from the reasoning trajectories. The path length is set to 3, and we remove the self-loop actions for clarity.
Our model succeeds in distinguishing reasonable temporal rules with high confidence from weak ones with low confidence, thereby guiding action selection stably and efficiently.

Ablation Study
To further examine the effect of the three proposed solutions for lifelong TKG reasoning, we conduct an ablation study, shown in Table 4. First, we replace the temporal displacement in RL with timestamps (w/o td), i.e., the agent relies on embeddings of explicit timestamps when selecting actions. The MRR results decrease by 18.07% on average over the three benchmarks, indicating the significance of temporal displacement in RL: unlike absolute timestamps, it can be temporally extrapolated to future times.
Secondly, we remove the proposed edge-aware message passing module (w/o mp) and randomly initialize new entities. In this case, our model can neither transfer nor update knowledge, and the agent cannot capture the specific environment of each edge in RL. This leads to an average MRR drop of 10.86%, implying the importance of this module.
Finally, we replace the temporal-rule-based reward shaping with the original binary reward (w/o rs) and observe a 7.77% performance degradation. This shows that, guided during training by temporal rules independent of particular entities, our model gains a comprehensively enhanced ability to select reliable actions.

Conclusion
In this work, we study the lifelong TKG reasoning issue, which involves continually emerging entities and facts at future timestamps. To address this new problem, we propose a model under the RL framework: it uses temporal displacement in the action space to extrapolate to future timestamps; a new edge-aware message passing module to inductively transfer and update learned knowledge for new entities and facts; and temporal-rule-based reward shaping to guide training. Experimental results on three newly constructed benchmarks illustrate that our model achieves the best performance for lifelong TKG reasoning and the strongest knowledge lifelong learning capability.

Limitations
This paper mainly focuses on lifelong TKG reasoning, where we consider entities and facts emerging at future timestamps as a TKG grows, but not the changing case of relations. In most cases, the number of entities in TKGs is much larger than that of relations, and the emergence of entities lasts longer and is more common. For instance, ICEWS14 has 7,128 entities and 230 relations; the accumulated number of relation types in ICEWS14 rapidly reaches 115, half of the total, within 10 days, while the accumulated number of entities increases steadily over the entire time period. Therefore, we study the more severe case of emerging entities and leave research on emerging relations in lifelong TKG reasoning to future work.

Figure 1 :
Figure 1: A sample sequence of three TKG snapshots in lifelong TKG reasoning.New entities (blue nodes) and new facts (all edges) in each snapshot emerge with time.

Figure 2 :
Figure 2: Model overview. e_i is a candidate entity and δt_M^i is the corresponding temporal displacement (i = 1, 2, 3).

Table 1 :
Statistics of the constructed benchmarks.|E i |, |D i |, |T i | are numbers of entities, facts and timestamps in G i .

Table 3 :
Reasoning trajectories and the confidence of the corresponding temporal rules for the query (Sudan, sign formal agreement, Ethiopia, 2014/12/25). Target entities are underlined.

Table 4 :
MRR results of ablation study for our model on the three new benchmarks.