HiSMatch: Historical Structure Matching based Temporal Knowledge Graph Reasoning

A Temporal Knowledge Graph (TKG) is a sequence of KGs with respective timestamps, which adopts quadruples in the form of (\emph{subject}, \emph{relation}, \emph{object}, \emph{timestamp}) to describe dynamic facts. TKG reasoning has facilitated many real-world applications by answering queries such as (\emph{query entity}, \emph{query relation}, \emph{?}, \emph{future timestamp}) about the future. This is essentially a matching task between a query and candidate entities based on their historical structures, which reflect the behavioral trends of the entities at different timestamps. In addition, the most recent KGs provide background knowledge about all the entities, which is also helpful for the matching. Thus, in this paper, we propose the \textbf{Hi}storical \textbf{S}tructure \textbf{Match}ing (\textbf{HiSMatch}) model. It applies two structure encoders to capture the semantic information contained in the historical structures of the query and candidate entities. Besides, it adopts another encoder to integrate the background knowledge into the model. TKG reasoning experiments on six benchmark datasets demonstrate the significant improvement of the proposed HiSMatch model, with up to 5.6\% performance improvement in MRR, compared to the state-of-the-art baselines.


Introduction
Knowledge Graphs (KGs), which store facts as triples in the form of (subject, relation, object), have been widely applied to many NLP applications, such as question answering (Lan and Jiang, 2020), dialogue generation (He et al., 2017) and recommendation (Wang et al., 2019). However, facts may constantly change over time. Temporal Knowledge Graphs (TKGs) are a kind of KG that describes such dynamic facts by extending each triple with a timestamp as (subject, relation, object, timestamp). Usually, a TKG is represented as a sequence of KG snapshots. The TKG reasoning task is to infer new facts from known ones and primarily has two settings, interpolation and extrapolation. The former attempts to complete missing facts in history, while the latter aims to predict future facts from historical facts. This paper focuses on the extrapolation setting, which is more challenging and far from being solved (Jin et al., 2020). The task can be seen as answering a query about a future fact (e.g., (COVID-19, Infect, ?, 2022-8-1)) by selecting from all the candidate entities.
The key to answering queries about future facts is to understand the history thoroughly. All the existing models conduct reasoning based on substructures extracted from the whole history. These substructures can be divided into two types, i.e., query-related history (Jin et al., 2019; Zhu et al., 2021) and candidate-related history (Li et al., 2021b, 2022; Han et al., 2021a; Deng et al., 2020). The former contains the latest historical facts related to the subject and relation in the query, which reflects the behavioral trends of the subject concerning the query relation. The latter contains all the latest historical facts of the candidates without considering the query, which indicates the behavioral trends of all the entities. Both kinds of history are vital to TKG reasoning. Take the query (COVID-19, Infect, ?, 2022-8-1) for example: the query-related history contains facts like (COVID-19, Infect, *, t), where t is before 2022-8-1. The candidate-related history of a candidate A includes facts reflecting its own behaviors, like (A, *, *, t) or (*, *, A, t). In reality, the occurrence of the fact (COVID-19, Infect, A, 2022-8-1) is caused by the interactions between these two kinds of history. However, existing models focus on only one kind of history and underestimate the other, which limits their performance on TKG reasoning. Overall, it remains a challenge to model both kinds of history in a unified framework.
Besides, to reduce the computational cost caused by the enormous number of facts in history, these two kinds of history usually contain only one-hop facts of the centered entities. Thus, they cannot model the high-order associations among entities, which are also vital to TKG reasoning.
Motivated by these observations, we consider both query-related history and candidate-related history under a matching framework and propose the Historical Structure Matching (HiSMatch) model. Specifically, it applies two structure encoders to model the semantic information in the above two kinds of historical structures, respectively, and then obtains the matching scores. Both structure encoders contain three components: (1) a structure semantic component to model the structural dependencies among concurrent facts at the same timestamp; (2) a time semantic component to model the time numerical information of the historical facts; (3) a sequential pattern component to mine behavioral trends from the temporal order information. Additionally, to model the high-order associations among entities, we treat the most recent KGs as the background knowledge of each query and apply a GCN-based background knowledge encoder to obtain more informative entity representations for the two structure encoders.
Our contributions are summarized as follows: • We first advocate the importance of modeling both query-related and candidate-related history for TKG reasoning and transform the task into a matching problem between them.
• To solve this problem, we propose HiSMatch to comprehensively capture the information in both historical structures by modeling the structural dependencies among concurrent facts, the time numerical information of historical facts, and the temporal order among facts. HiSMatch additionally captures high-order associations among entities by modeling the recent background knowledge.
• Extensive experiments on six commonly used benchmarks demonstrate that HiSMatch achieves significantly better performance (up to 5.6% improvement in MRR) on the TKG reasoning task.

Related Work
TKG Reasoning under the interpolation setting focuses on completing missing facts at past timestamps (Liao et al., 2021; Goel et al., 2020; Wu et al., 2020; Han et al., 2020a; Jiang et al., 2016; Dasgupta et al., 2018; Garcia-Duran et al., 2018; Xu et al., 2021). For example, TTransE (Leblay and Chekol, 2018) extends the idea of TransE (Bordes et al., 2013) by adding temporal order constraints among facts. Also, HyTE (Dasgupta et al., 2018) projects entities and relations to time-related hyperplanes to generate time-aware representations. TNTComplEx (Lacroix et al., 2020) performs 4th-order tensor factorization to get time-aware representations of entities. However, these models cannot obtain representations of unseen timestamps and are thus not suitable for the extrapolation setting.

TKG Reasoning under the extrapolation setting aims to predict facts at future timestamps. According to the historical structure the models focus on, the existing models can be categorized into two groups: query-based and candidate-based models.
Query-based models focus on modeling the query-related history. For example, RE-NET (Jin et al., 2020) models the query-related subgraph sequence. GHNN (Han et al., 2020c) introduces the temporal point process to model precise time information and takes the 1-hop subgraphs of the query entity into consideration. CyGNet (Zhu et al., 2021) captures repetitive patterns by modeling repetitive facts with the same subject and relation as the query. xERTE (Han et al., 2020b) learns a dynamic pruning procedure to find the query-related subgraphs. CluSTeR (Li et al., 2021a) and TITer (Sun et al., 2021) both adopt reinforcement learning to discover query-related paths in history.
Candidate-based models encode the latest historical facts of all the candidate entities without considering the query; the query is considered only in the decoding phase. RE-GCN and its extension CEN (Li et al., 2021b, 2022) design an evolutional model to get the representations of all the candidates by modeling the history at a few latest timestamps. TANGO (Han et al., 2021a) utilizes neural ordinary differential equations to model the structure information for each candidate entity. Glean (Deng et al., 2020) introduces unstructured textual information to enrich the candidate-related history.

[Figure 1: the overall framework of HiSMatch, including the query structure encoder, the candidate structure encoder, the background knowledge encoder, and the matching function; the figure content was lost in extraction.]

Above all, none of the existing models considers both kinds of history in a unified framework. HiSMatch considers these two kinds of history under the matching framework and takes the advantages of both kinds of models.

Problem Formulation
A TKG G = {G_0, ..., G_t, ..., G_T} is a sequence of KGs, each of which contains the facts that occurred at timestamp t, i.e., G_t = {E, R, F_t}, where E is the set of entities, R is the set of relations and F_t is the set of facts that occurred at t. Each fact is a quadruple (e_s, r, e_o, t), where e_s, e_o ∈ E and r ∈ R. For each fact in the TKG, we correspondingly add the inverse quadruple (e_o, r^{-1}, e_s, t) into the TKG. The TKG reasoning task aims to predict the missing object by answering a query q = (e_q, r_q, ?, t_q) given the historical KGs. Note that, when predicting the missing subject of a query q = (?, r_q, e_q, t_q), we can convert the query into q = (e_q, r_q^{-1}, ?, t_q).
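The inverse-quadruple augmentation above can be sketched as follows. This is a hypothetical illustration, not the authors' code; the convention of assigning relation IDs 0..R-1 their inverses R..2R-1 is our assumption.

```python
# Hypothetical sketch: augmenting a TKG with inverse quadruples.
# Relation r gets inverse r + num_relations (an assumed ID scheme).
def add_inverse_quadruples(facts, num_relations):
    """facts: list of (subject, relation, object, timestamp) tuples."""
    augmented = list(facts)
    for s, r, o, t in facts:
        augmented.append((o, r + num_relations, s, t))  # (e_o, r^-1, e_s, t)
    return augmented

facts = [(0, 1, 2, 10), (2, 0, 3, 11)]
aug = add_inverse_quadruples(facts, num_relations=2)
# aug additionally contains (2, 3, 0, 10) and (3, 2, 2, 11)
```

A subject-prediction query (?, r_q, e_q, t_q) is then answered as (e_q, r_q + num_relations, ?, t_q) over the augmented facts.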

The HiSMatch model
HiSMatch aims to capture the semantic similarity contained in the query-related and candidate-related historical structures. For each query time, it first embeds the background knowledge into the initial entity representations. With the initial representations as input, it maps the semantic information in the two historical structures into vectorized structure representations. Based on the structure representations, the matching scores are calculated. Thus, as shown in Figure 1, HiSMatch consists of four parts: the query structure encoder, the candidate structure encoder, the background knowledge encoder, and the matching function. First, two kinds of historical structures and a background knowledge graph are derived from the TKG. Then, the background knowledge encoder gets the representations of the entities with the background knowledge graph as input (Section 4.3). With the learned representations as input, the two structure encoders use three components to integrate three kinds of semantic information into the representations of the query-related structure and the candidate-related structure, respectively (Sections 4.1 and 4.2). Finally, the matching function calculates the scores between the query and the candidates based on the representations of their historical structures (Section 4.4).

Query Structure Encoder
The query-related historical structure should reflect the behavioral trends of the query. Motivated by this, for a query q = (e_q, r_q, ?, t_q), the query-related historical structure consists of the latest historical facts with the same subject e_q and relation r_q. The facts co-occurring at the same timestamp t form a subgraph g^q_t centered on e_q. Then, we obtain a subgraph sequence {g^q_{t_1}, ..., g^q_{t_i}, ..., g^q_{t_m}}, where t_1 < ... < t_i < ... < t_m < t_q and m is the maximum length of the sequence.
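Extracting this subgraph sequence can be sketched as below. The function and variable names are illustrative assumptions; the sketch only keeps the m latest timestamps at which facts with the query's subject and relation occurred.

```python
# Hypothetical sketch of building the query-related subgraph sequence:
# group facts matching (e_q, r_q) by timestamp, keep the m latest before t_q.
from collections import defaultdict

def query_history(facts, e_q, r_q, t_q, m):
    by_time = defaultdict(list)
    for s, r, o, t in facts:
        if s == e_q and r == r_q and t < t_q:
            by_time[t].append((s, r, o, t))
    timestamps = sorted(by_time)[-m:]          # t_1 < ... < t_m < t_q
    return [by_time[t] for t in timestamps]    # one subgraph per timestamp

facts = [(0, 1, 2, 1), (0, 1, 3, 1), (0, 1, 4, 2), (0, 2, 5, 3), (0, 1, 6, 5)]
seq = query_history(facts, e_q=0, r_q=1, t_q=5, m=2)
# two subgraphs: the facts at t=1 and at t=2 (the fact at t=5 is excluded)
```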
Three kinds of information are vital in the above historical structure, namely, the structure semantic information of each subgraph, the time numerical information of each subgraph, and the temporal order information across subgraphs. To model these three kinds of information, we design three components as follows:

Structure Semantic Component. The structure semantic information captures the associations among the query entity and other entities through the query relation and implies possible answer entities. Since all the concurrent facts, which have the same subject and relation as the query, form a one-hop homogeneous graph, we simply perform mean pooling over all the neighbor entities in each subgraph g^q_{t_i} to get its structure semantic representation g^q_{t_i}:

g^q_{t_i} = (1 / |N^q_{t_i}|) Σ_{e ∈ N^q_{t_i}} e,   (1)

where N^q_{t_i} is the set of neighbors of the query entity e_q in g^q_{t_i} and e is the representation of entity e calculated by the background knowledge encoder (see Section 4.3).
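The mean pooling of Eq. (1) amounts to one line of array code. The embedding table and its shapes below are illustrative assumptions.

```python
# A minimal sketch of the structure semantic component: mean pooling over
# the neighbor-entity embeddings in one query-related subgraph.
import numpy as np

def structure_semantic(entity_emb, neighbor_ids):
    """g^q_{t_i} = mean of the neighbor embeddings in subgraph g^q_{t_i}."""
    return entity_emb[neighbor_ids].mean(axis=0)

entity_emb = np.arange(12, dtype=float).reshape(4, 3)  # 4 entities, dim 3
g = structure_semantic(entity_emb, [1, 3])
# mean of rows 1 and 3: [6., 7., 8.]
```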
Time Semantic Component. Previous works (Jin et al., 2019, 2020) only consider the temporal order of the facts but ignore their time numerical information. A much earlier fact and a recent one contribute equally when they have the same order in the subgraph sequence; actually, the recent fact is more important. Motivated by this, we model the time numerical information by encoding the time interval d = t_q − t_i into the time representation v(d). However, giving each time interval a learnable time representation always meets the time sparsity problem (i.e., a time interval used in the test phase may not exist in the training phase). Thus, we model any time interval by rescaling a learnable time unit w_t with a time bias b_t:

v(d) = d w_t + b_t.   (2)

Since some facts occur periodically, such as elections, we additionally apply a periodic activation function, i.e., the cosine function, on v(d).
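A sketch of this component under the stated design: a scalar interval d is rescaled by a learnable unit w_t and bias b_t, then passed through a cosine. The dimensions and the fixed random parameters below are illustrative assumptions.

```python
# Sketch of the time semantic component: v(d) = cos(d * w_t + b_t).
# Any interval d reuses the same parameters, avoiding the time-sparsity
# problem of per-interval embeddings.
import numpy as np

rng = np.random.default_rng(0)
dim_t = 32
w_t = rng.normal(size=dim_t)   # learnable time unit (here: fixed random)
b_t = rng.normal(size=dim_t)   # learnable time bias

def time_repr(d):
    return np.cos(d * w_t + b_t)

v = time_repr(7)               # representation of a 7-step interval
```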
Sequential Pattern Component. Furthermore, the temporal order information in the subgraph sequence implies sequential patterns of the query entity. To integrate the sequential patterns into the representation of the query, we use a Gated Recurrent Unit (GRU) to model the subgraph sequence. First, for every timestamp t_i (i = 1, 2, ..., m), we concatenate the structure semantic representation and the time semantic representation from the above two components as the input of the GRU:

x^q_{t_i} = [g^q_{t_i} ; v(t_q − t_i)].

Then these representations {x^q_{t_1}, ..., x^q_{t_i}, ..., x^q_{t_m}} are fed into the GRU recursively:

h^q_{t_i} = GRU(x^q_{t_i}, h^q_{t_{i−1}}),

where i ∈ {1, 2, ..., m} and h^q_{t_0} is the randomly initialized hidden representation of the GRU. The final representation of the query (e_q, r_q, ?, t_q) is the output of the GRU at the final step, i.e., h^q_{t_q} = h^q_{t_m}.
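The recursion above can be sketched with a minimal NumPy GRU cell. This is a bias-free re-implementation for illustration, not the paper's code; all shapes and the random inputs are assumptions.

```python
# Compact sketch of the sequential pattern component: each step's input
# x_t concatenates the structure and time representations, and a GRU cell
# rolls them up; the last hidden state represents the query.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_cell(x, h, Wz, Wr, Wh):
    """One GRU step; each W* maps the concatenated [x; h] to hidden size."""
    xh = np.concatenate([x, h])
    z = sigmoid(Wz @ xh)                                # update gate
    r = sigmoid(Wr @ xh)                                # reset gate
    h_tilde = np.tanh(Wh @ np.concatenate([x, r * h]))  # candidate state
    return (1 - z) * h + z * h_tilde

rng = np.random.default_rng(1)
d_in, d_h = 6, 4
Wz, Wr, Wh = (rng.normal(size=(d_h, d_in + d_h)) for _ in range(3))
h = np.zeros(d_h)                                       # h^q_{t_0}
for _ in range(5):                                      # m = 5 steps
    g_t, v_t = rng.normal(size=3), rng.normal(size=3)   # g^q_{t_i}, v(d)
    x = np.concatenate([g_t, v_t])                      # x^q_{t_i}
    h = gru_cell(x, h, Wz, Wr, Wh)                      # h^q_{t_i}
# h is now h^q_{t_m}, the final query representation
```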

Candidate Structure Encoder
The candidate-related historical structure reflects the behavioral trends of each candidate entity. For each candidate entity e, we use its 1-hop subgraphs at the latest historical timestamps to form a subgraph sequence {g^e_{t'_1}, ..., g^e_{t'_i}, ..., g^e_{t'_n}}, where n is the maximum length of the sequence. This structure is similar to the query-related historical structure; the difference is that each subgraph in the sequence is multi-relational. Therefore, we use an encoder similar to the query structure encoder, except for the calculation of the structure semantic representation of each subgraph. More specifically, we adopt CompGCN (Vashishth et al., 2019) instead of the mean pooling operation to capture the semantic information of different relations. The representation of the candidate entity e at timestamp t'_i is calculated by a CompGCN with ω_1 layers; the representation at the (l+1)-th layer is

h^{e,l+1}_{t'_i} = f( (1/c_e) Σ_{(e',r)} W^l_1 (h^{e',l}_{t'_i} + r) + W^l_2 h^{e,l}_{t'_i} ),

where r is the representation of relation r; h^{e',l}_{t'_i} denotes the l-th layer representation of entity e' at timestamp t'_i; W^l_1, W^l_2 are the weight matrices of the l-th layer; and c_e is a normalization constant, which equals the in-degree of entity e. Note that the input representations of all entities are also calculated by the background knowledge encoder, which will be introduced in Section 4.3.
Then, the structure semantic representation of the subgraph g^e_{t'_i}, i.e., g^e_{t'_i}, equals the representation of the centered candidate e from the last layer of the CompGCN, i.e., g^e_{t'_i} = h^{e,ω_1}_{t'_i}. Similar to the query structure encoder, we use another GRU to model the subgraph sequence. The input of the GRU at timestamp t'_i is

x^e_{t'_i} = [g^e_{t'_i} ; v(d)],

where d = t_q − t'_i and v(d) is the time interval representation calculated by a shared time semantic component introduced in Section 4.1. Finally, the output of the GRU at the last step t'_n is used as the representation of candidate entity e at t_q, i.e., h^e_{t_q} = h^e_{t'_n}.
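A hedged sketch of one CompGCN-style layer as described above: neighbor messages compose entity and relation representations (here by addition), are normalized by the in-degree c_e, and a separate weight carries the self representation. This is an illustrative re-derivation, not the authors' implementation; the ReLU activation and edge format are assumptions.

```python
# One CompGCN-style layer over a multi-relational subgraph.
import numpy as np

def compgcn_layer(h, rel, edges, W1, W2):
    """h: (E, d) entity reps; rel: (R, d) relation reps;
    edges: list of (src, r, dst) meaning src sends a message to dst."""
    out = h @ W2.T                                  # self term  W2 h_e
    msg = np.zeros_like(out)
    indeg = np.zeros(h.shape[0])
    for src, r, dst in edges:
        msg[dst] += (h[src] + rel[r]) @ W1.T        # W1 (h_{e'} + r)
        indeg[dst] += 1
    indeg[indeg == 0] = 1                           # avoid division by zero
    out += msg / indeg[:, None]                     # 1/c_e normalization
    return np.maximum(out, 0)                       # f = ReLU (assumed)

rng = np.random.default_rng(2)
h = rng.normal(size=(4, 3)); rel = rng.normal(size=(2, 3))
W1, W2 = rng.normal(size=(3, 3)), rng.normal(size=(3, 3))
h_next = compgcn_layer(h, rel, [(0, 1, 2), (1, 0, 2)], W1, W2)
```

Stacking ω_1 such layers and reading off the centered candidate's row gives g^e_{t'_i}.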

Background Knowledge Encoder
The above two historical structures are local information centered on the query entity or the candidate entity, which focus on describing the behavioral trends of the entities. However, these two kinds of structures may miss some important entities that have high-order associations with the query entity or the candidate entity in the whole TKG. Since recent history is more important, for a query timestamp t_q, we gather the latest k KGs into a cumulative graph Ĝ_{t_q}, called the background knowledge graph. Formally, Ĝ_{t_q} = {E, R, F̂_{t_q}}, where F̂_{t_q} = {(e_s, r, e_o) | (e_s, r, e_o, t) ∈ F_t, t_q − k ≤ t < t_q} is a set of facts. We adopt another CompGCN with ω_2 layers to model it, since it is also a multi-relational graph. The representations of all entities are calculated as follows:

E = CompGCN(Ĝ_{t_q}, E′, R),

where E′ is the randomly initialized entity representation matrix and E is used as the input entity representation matrix of the aforementioned two structure semantic components. R is the relation representation matrix, which is shared with the structure semantic components. For entities that have no facts in the background knowledge graph, a self-loop operation is conducted to get their representations. Note that the background knowledge graph changes with the query time, so E is different for different t_q.
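Assembling the background knowledge graph is a simple set union over the latest k snapshots, with timestamps dropped. The function name below is an illustrative assumption.

```python
# Hypothetical sketch: the background knowledge graph is the union of
# (subject, relation, object) triples from the latest k snapshots
# before the query time t_q.
def background_graph(facts, t_q, k):
    """facts: (s, r, o, t) quadruples; returns a set of static triples."""
    return {(s, r, o) for s, r, o, t in facts if t_q - k <= t < t_q}

facts = [(0, 1, 2, 3), (0, 1, 2, 7), (1, 0, 3, 8), (2, 1, 0, 9)]
g = background_graph(facts, t_q=9, k=2)
# keeps facts with 7 <= t < 9: {(0, 1, 2), (1, 0, 3)}
```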

Matching Function
With the representation h^q_{t_q} of the query and the representation h^e_{t_q} of each candidate entity e at timestamp t_q as input, the matching function calculates the score of the quadruple (e_q, r_q, e, t_q). As previous work (Vashishth et al., 2019; Li et al., 2021b) shows that convolutional score functions achieve good performance on reasoning tasks, ConvTransE (Shang et al., 2019) is chosen as the matching function; it contains 1D convolution and fully-connected layers, denoted by ConvTransE(·). To incorporate the behavioral information of the query entity into the query representation, a sum-up operation is performed between h^q_{t_q} and h^{e_q}_{t_q}. Then, the score for each candidate entity e is calculated as follows:

ϕ(e_q, r_q, e, t_q) = σ(h^e_{t_q} · ConvTransE(h^q_{t_q} + h^{e_q}_{t_q}, r_q)),

where σ(·) is the sigmoid function.
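A simplified sketch of the matching step. The real model uses ConvTransE (1D convolution plus fully-connected layers); here a plain linear map stands in for ConvTransE(·, r_q), purely to show how the score combines the query-side and candidate-side representations.

```python
# Simplified matching: phi = sigma(h_e . f(h_q + h_eq, r_q)),
# with the ConvTransE decoder f approximated by one linear map W.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def match_score(h_q, h_eq, r_q, h_e, W):
    query = np.concatenate([h_q + h_eq, r_q])   # sum-up, then pair with r_q
    decoded = W @ query                          # stand-in for ConvTransE
    return sigmoid(h_e @ decoded)

rng = np.random.default_rng(3)
d = 4
h_q, h_eq, r_q, h_e = (rng.normal(size=d) for _ in range(4))
W = rng.normal(size=(d, 2 * d))
score = match_score(h_q, h_eq, r_q, h_e, W)     # scalar in (0, 1)
```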

Training Details
The parameters Θ of the whole model are learned by minimizing the cross-entropy loss:

L = − Σ_{t=1}^{T} Σ_{(e_s, r, e_o, t) ∈ F_t} Σ_{e ∈ E} y^t_e log ϕ(e_s, r, e, t),

where T is the number of timestamps in the training set; y^t_e = 1 if e equals e_o and 0 otherwise; and ϕ(e_s, r, e, t) is the matching score between the query (e_s, r, ?, t) and the candidate entity e.
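The objective can be sketched in a few lines: each query scores every candidate entity and the one-hot gold answer supplies the labels. Averaging over queries (instead of summing) is our simplification for illustration.

```python
# Sketch of the cross-entropy objective over a batch of queries.
import numpy as np

def cross_entropy_loss(scores, answers):
    """scores: (Q, E) matching scores in (0, 1); answers: (Q,) gold indices."""
    eps = 1e-9
    y = np.zeros_like(scores)
    y[np.arange(len(answers)), answers] = 1.0    # y^t_e = 1 iff e == e_o
    return -np.mean(np.sum(y * np.log(scores + eps), axis=1))

scores = np.array([[0.9, 0.1], [0.2, 0.8]])
loss = cross_entropy_loss(scores, np.array([0, 1]))
# -(log 0.9 + log 0.8) / 2 ≈ 0.164
```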

Experiments
We compare HiSMatch with a number of baselines on six datasets to validate its effectiveness. In addition, we conduct an ablation study to analyze the importance of its different parts. We also evaluate the effects of different kinds of GCN layers in the candidate structure encoder and the background knowledge encoder. Besides, we study the maximum time interval that HiSMatch models.

Datasets
To evaluate the effectiveness of HiSMatch, we use the following six benchmark TKGs: ICEWS14 (Li et al., 2021b), ICEWS14* (Han et al., 2020b), ICEWS18 (Jin et al., 2020), ICEWS05-15 (Li et al., 2021b), GDELT (Jin et al., 2020) and WIKI (Leblay and Chekol, 2018). The first four datasets were extracted from the large-scale event-based database, the Integrated Crisis Early Warning System: ICEWS14 and ICEWS14* contain events in 2014, ICEWS18 contains events in 2018, and ICEWS05-15 contains events that occurred from 2005 to 2015. GDELT is extracted from the Global Database of Events, Language, and Tone (Leetaru and Schrodt, 2013) and has a fine-grained time granularity of 15 minutes. WIKI is a TKG with the largest time granularity of one year. The statistics of the datasets are listed in Table 1.

Evaluation Metrics
We employ the widely used Hits@N and Mean Reciprocal Rank (MRR) to evaluate the performance of the models. Hits@N measures the proportion of correct entities whose scores rank less than or equal to N. In this paper, N ∈ {1, 3, 10}, i.e., the results in terms of Hits@1, Hits@3, and Hits@10 are reported. MRR measures the average of the reciprocal ranks and is the most typical metric for TKG reasoning. Previous work (Han et al., 2020b, 2021a; Li et al., 2021a,b) points out that the traditional filtered setting is flawed, as it ignores the time of the fact and filters all facts with the same entity and relation before ranking. Actually, only the facts occurring at the same time should be filtered. Thus, we calculate the results under the more reasonable time-aware filtered setting following Sun et al. (2021) and Han et al. (2021a), which only filters out the quadruples occurring at the query time.
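The time-aware filtered ranking can be sketched as below: before ranking the gold object, only the *other* correct objects of the same (subject, relation, timestamp) are masked out, unlike the traditional setting that filters across all timestamps. Names are illustrative.

```python
# Sketch of time-aware filtered evaluation for one query.
import numpy as np

def time_aware_rank(scores, gold, other_gold_at_t):
    """scores: (E,) candidate scores; mask competitors true at query time."""
    s = scores.copy()
    s[list(other_gold_at_t)] = -np.inf           # time-aware filtering
    return int(np.sum(s > s[gold])) + 1          # rank of the gold entity

scores = np.array([0.9, 0.8, 0.7, 0.95])
# entity 3 outscores gold entity 1, but is itself a correct answer at t_q
rank = time_aware_rank(scores, gold=1, other_gold_at_t={3})
mrr_contrib, hits1 = 1.0 / rank, int(rank <= 1)
# rank == 2: only entity 0 still outscores the gold after filtering
```

MRR averages `mrr_contrib` over all queries; Hits@N averages the indicator `rank <= N`.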

Implementation Details
The dimensions of the entities and relations are set to 128, and the dimension of the time semantic representation is set to 32 for all the datasets. For the structure encoders, the optimal lengths of the historical structures of the query (m) and the candidate entities (n) are equal in this paper: they are set to 5 for ICEWS14, ICEWS18, ICEWS05-15 and GDELT, 6 for ICEWS14*, and 1 for WIKI. The number of CompGCN layers ω_1 is set to 1 for GDELT and 2 for the other datasets; the number of GRU layers is set to 1 for all the datasets and the output dimension of the GRU unit is set to 128. For the background knowledge encoder, the number of latest KGs k is experimentally set to 4, 1, 2, 1, 2, 2 for ICEWS14, ICEWS18, GDELT, ICEWS14*, ICEWS05-15, and WIKI, respectively; we set the dropout rate for each layer to 0.2 and the number of CompGCN layers in the background knowledge encoder, ω_2, to 2 for all the datasets. For the matching function, the number of kernels is set to 50, the kernel size to 2×3, and the dropout rate to 0.2 for all the datasets. Adam (Kingma and Ba, 2014) is adopted for parameter learning with a learning rate of 0.001. All the experiments are carried out on a 32GB Tesla V100 GPU.

Experimental Results
The experimental results of HiSMatch and all the baselines on TKG reasoning are presented in Tables 2 and 3. It can be seen that HiSMatch consistently outperforms all the baselines on all six TKGs, which indicates its effectiveness and superiority. Especially on ICEWS14 and ICEWS05-15, HiSMatch achieves the most significant improvements of 5.6% and 4.8% in MRR, respectively. In more detail, we have the following observations: (1) HiSMatch outperforms all the static KG reasoning models because it can capture both the time information of each fact and the sequential patterns in TKGs; (2) HiSMatch performs much better than the interpolation models because they cannot learn representations for unseen timestamps; (3) More importantly, HiSMatch gets better results than all the extrapolation baselines, which proves the superiority of modeling both kinds of history, i.e., query-related history and candidate-related history; (4) The baselines focusing on query-related history (e.g., TITer) are usually strong on precision and get good results on Hits@1, while the baselines focusing on candidate-related history (e.g., RE-GCN) are more capable on recall and get good results on Hits@10. In a word, by transforming the TKG reasoning task into a matching task, HiSMatch utilizes both kinds of history more comprehensively. Moreover, HiSMatch captures more high-order associations via the background knowledge graph. Therefore, it gets the best performance on all the metrics.
By conducting experiments on six datasets with different time granularities, we found that the time granularity partly determines what is vital to the TKG reasoning task. Take the two most typical datasets for example: (1) GDELT has the most fine-grained time granularity (15 minutes), and the results of all the baselines are similarly poor compared with those on the other datasets. There are more timestamps in history when the time granularity gets more fine-grained, which requires the model to capture history at more timestamps. Under the matching framework, HiSMatch can capture longer historical information than candidate-based models and more comprehensive history than query-based models; thus, it gets better results (2.3% in MRR). (2) Contrary to GDELT, WIKI has the largest time granularity (1 year). In this situation, the behavioral trends implied in the history at fewer timestamps are vital for reasoning. Moreover, there are more structural dependencies in each KG due to the large time granularity. Thus, RE-GCN, which focuses on modeling the global structure at the latest few timestamps, gets strong performance on this dataset. Still, HiSMatch outperforms it by modeling the two kinds of substructures and the background knowledge.

Ablation Study
To further analyze how each part of HiSMatch contributes to the final results, we report the MRR results of the HiSMatch variants on the validation sets of three typical datasets, namely, ICEWS14, ICEWS18 and WIKI, in Table 4.
Impact of the Query Structure Encoder. To demonstrate how the query structure encoder contributes to the final results of HiSMatch, we remove the query structure encoder and use the representation of the query entity from the candidate structure encoder as the representation of the query. The results are denoted as -query in Table 4. It can be seen that -query performs consistently worse than HiSMatch on all the datasets. This is because the query-related historical structure can model the query more accurately by modeling the repetitive facts focused on the query relation.
Impact of the Candidate Structure Encoder. The results denoted as -candidate in Table 4 demonstrate the performance of HiSMatch without modeling the candidate-related history. More specifically, we directly add a fully-connected layer after the query structure encoder to get the scores of all entities, following (Jin et al., 2020). It can be observed that ignoring the candidate-related historical structure has a great impact on the results. Candidate-related history contains rich information that describes the behavioral trends of all candidate entities, which is helpful for selecting the correct answer. Especially on WIKI, the dataset with the largest time granularity as mentioned in Section 5.2, entities have more associations among each other at each timestamp and thus contain rich behaviors.

Impact of the Background Knowledge Encoder. -background in Table 4 denotes a variant of HiSMatch that uses the learned representations of entities without the background knowledge encoder. Note that, in the training phase, the randomly initialized representations of entities are learned and updated; in the test phase, the model uses the learned representations of entities as input. It can be observed that the performance of -background is worse than HiSMatch on all the datasets, especially on WIKI, which has a time granularity of one year. There are more high-order associations in its background knowledge graph; thus, the background knowledge is more important for WIKI than for the other datasets.
Impact of the Time Semantic Component. To demonstrate how the time semantic component contributes to the final results, we remove the time semantic component and only use the outputs of the structure semantic component as the inputs of the sequential pattern component. The results are denoted as -time in Table 4. It can be seen that the time semantic component is useful on all the datasets. This is because the time semantic component describes the time numerical information, which helps HiSMatch distinguish different time intervals between the history and the query.

Comparative Study on Different GCNs
To further study the impact of different kinds of GCNs in the candidate structure encoder and the background knowledge encoder, we replace CompGCN in these two encoders with CompGCN-mult (Vashishth et al., 2019), RGCN (Schlichtkrull et al., 2018) and KBAT (Nathani et al., 2019). The MRR results on the validation sets of ICEWS14, ICEWS18, and WIKI are reported in Table 5. It can be seen that HiSMatch (CompGCN) gets the best performance. For ICEWS14 and ICEWS18, the two datasets with a time granularity of one day, the structural dependencies are relatively simple; thus, different GCNs get similar performance. For WIKI, the dataset with a time granularity of one year, there are more structural dependencies in the candidate-related history and the background knowledge graph. Therefore, the performance gap caused by the capabilities of the GCNs becomes more significant.

Study on Maximum Time Interval
To explore the maximum time intervals between the query time and the history that HiSMatch models, we conduct statistics on the maximum time interval of historical facts in the query-related historical structure (∆t = t_q − t_1) and the candidate-related historical structure (∆t′ = t_q − t′_1) under the optimal parameters (we report the average maximum time interval on the validation sets). We also report the time interval of the background knowledge graphs (k) for comparison. As shown in Figure 2, ∆t and ∆t′ are both much larger than k. The results demonstrate that the two historical structures can model the long-term behaviors of the query and candidates. The background knowledge graph focuses on modeling high-order associations among all the facts at the latest few timestamps, i.e., the global structural dependencies in a much shorter time interval. It can be seen that ∆t on GDELT is more than 800, while the value is only around 16 on WIKI, which verifies the discussion in Section 5.2.

Conclusion
In this paper, we considered both kinds of history, namely the query-related history and the candidate-related history, in TKG reasoning and transformed the task into a matching problem between them for the first time. We further proposed the HiSMatch model, which applies two structure encoders to calculate the representations of the historical structures of the queries and candidates, respectively. Each encoder contains a structure semantic component to model the concurrent structure among entities, a time semantic component to model the time numerical information of facts, and a sequential pattern component to capture the temporal order. Besides, HiSMatch integrates the background knowledge into the representations of entities. Experimental results on six benchmark datasets demonstrate the superiority of HiSMatch.

Limitations
The limitations of this work can be summarized in two points: (1) HiSMatch uses a heuristic history-finding strategy to get the two kinds of history, which may lose some critical facts. Although it uses the background knowledge encoder to consider more historical facts, a learnable history-finding strategy would be more helpful. (2) HiSMatch is an initial attempt to apply the matching framework to the TKG reasoning task using two separate encoders for each kind of history, which fails to model the interactions between the two kinds of history explicitly. Designing a cross-encoder to match the history more comprehensively is a good direction for future studies.

Figure 2: Statistics of the maximum time intervals in history on four datasets.

Table 1: Statistics of the datasets (|E_train|, |E_valid|, |E_test| are the sizes of the training, validation, and test sets). The first four datasets have a time granularity of 24 hours.

Table 4: MRR results (in percentage) of different variants of HiSMatch on three datasets.

Table 5: Performance (in percentage) of HiSMatch with different kinds of GCNs.