Syntactically Robust Training on Partially-Observed Data for Open Information Extraction

Open Information Extraction models have shown promising results with sufficient supervision. However, these models face a fundamental challenge: the syntactic distribution of the training data is only partially observed in comparison to the real world. In this paper, we propose a syntactically robust training framework that enables models to be trained on a syntactically abundant distribution based on diverse paraphrase generation. To tackle the intrinsic problem of knowledge deformation introduced by paraphrasing, two algorithms based on semantic similarity matching and syntactic tree walking are used to restore the expressionally transformed knowledge. The training framework can be generally applied to other domains where the syntactic distribution is partially observed. Based on the proposed framework, we build a new evaluation set called CaRB-AutoPara, a syntactically diverse dataset consistent with the real-world setting, for validating the robustness of the models. Experiments including a thorough analysis show that the performance of the model degrades as the difference in syntactic distribution increases, while our framework gives a robust boundary. The source code is publicly available at https://github.com/qijimrc/RobustOIE.


Introduction
Open Information Extraction (OpenIE) involves converting natural text into a set of n-ary structured tuples of the form (arg_1, predicate, arg_2, ..., arg_n), composed of a single predicate as well as n arguments. With the advantages of domain independence and scalability, OpenIE serves as a backbone in natural language understanding and fosters many applications such as text summarization (Fan et al., 2019) and question answering (Yan et al., 2018).
Tremendous efforts have been devoted to building models that can better fit the extractions from texts (Michele et al., 2007; Angeli et al., 2015; Saha and Mausam, 2018; Kolluru et al., 2020b; Yu et al., 2021). However, a major issue remaining in OpenIE is syntactic partial observability: the syntactic distribution of the existing training set is based on only partial observations, and it is far from covering the entire syntactic hypothesis space of the real world. This issue creates a challenge in that models rely heavily on the syntactic forms seen during training, and degrade significantly when the syntactic distribution changes in the real world.

* Corresponding author: xubin@tsinghua.edu.cn
An evaluation is shown in Figure 1. We cluster the CaRB (Bhardwaj et al., 2019) samples based on the HW-Syntactic Distance (introduced in Sec. 3.5), an effective metric that measures the syntactic difference between two sentences, and evaluate the state-of-the-art model trained on the OpenIE4 data (Kolluru et al., 2020b) on them. A frustrating result shows that the model performance degrades significantly as the syntactic similarity between the training set and the clustering centers of the subsets decreases. The biased performance comes from the inconsistency of the syntactic distributions among the data. For example, in Figure 1, the model achieves a depressing F1 score of 0.47 on subset 5, which has the lowest average syntactic similarity to the training set. Therefore, to build robust OpenIE systems, we need to train the models on a sufficient syntactic distribution.
However, it is not trivial to obtain data that are both diverse and accurate enough to satisfy the distribution assumption. First, it is extremely expensive and almost impossible for human annotators to provide a large corpus with diverse syntactic expressions. Second, existing distant supervision-based methods are not applicable to OpenIE due to the uncertainties of both the type and form of arguments and predicates.
Humans learn syntactic grammar by paraphrasing the same meaning into different expressions. For example, the following two sentences convey the same meaning in different syntactic forms. Diverse paraphrases of normal-scale training data can guarantee a sufficient syntactic distribution. However, an intrinsic problem that hinders the efficiency of this approach is Knowledge Deformation. In the following example, it is difficult to recover the source object Earth in the target paraphrase b, as it has been transformed into the name of the planet with different syntax.
• a. After five years of searching, the Colonials found a new world and named it Earth.
• b. The colonials searched for five years until they discovered a new world and gave him the name of the planet.
In this paper, we propose a syntactically robust training framework that enables OpenIE models to be trained on a syntactically abundant distribution based on diverse paraphrase generation. Specifically, we first generate a large-scale, syntactically diverse paraphrase candidate set for the training data using an off-the-shelf paraphrase generator. Then, we propose two adaptive algorithms to recover the deformed arguments of the original knowledge: a semantic similarity-based matching method to locate the disordered arguments, and a syntactic tree walking-based method to complete the consecutive spans. We further employ the generative T5 (Raffel et al., 2020) model to restore the deformed predicates, as there are potential tense and voice changes in the target paraphrase. Finally, a simple but effective denoising method is utilized to prevent the impact of false positives in training.
To exhaustively validate the syntactic robustness of OpenIE models in the real-world setting, an additional evaluation set including diverse paraphrases and knowledge triples has been built on the basis of CaRB. We conduct experiments on the standard and the proposed evaluation sets based on the division of different syntactic categories, and a comprehensive analysis shows that model performance decreases as the difference in syntactic distributions increases, while our training framework gives a robust boundary.

Syntactically Robust Training Framework for OpenIE

Overview
The task of OpenIE aims to build a model p_θ to automatically extract a set of n-ary tuples {r_i = (a_1, p_r, a_2, a_3, ..., a_n)}_{i=1}^{m} for each sentence, where p_r indicates the predicate, a_1 and a_2 indicate the subject and object, and a_3, ..., a_n refer to the other arguments such as time and location. Given a training set D = (s_1, s_2, ..., s_{|D|}) consisting of sentence samples, where each sentence exhibits a syntactic structure e_s, our goal is to maximize the expectation of the log-likelihood with respect to the data distribution p_D:

θ* = argmax_θ E_{s ∼ p_D} [ log p_θ(r_1, ..., r_m, e_s | s) ]

where different OpenIE models may adopt distinct strategies to model the probability p_θ, such as the triple-generation paradigm (Kolluru et al., 2020a) or the sequence-labeling paradigm (Zhan and Zhao, 2020), and the maximization is performed by gradient ascent.
The syntactic distribution of the training set, e_s ∼ p_D, is far from covering the entire syntactic hypothesis space, which is fatal to OpenIE modeling. In this research, we aim to expand the training data to a sufficient syntactic distribution. The proposed framework is illustrated in Figure 2. We first generate a syntactically diverse paraphrase candidate set for the training data with an off-the-shelf paraphrase generation model. Then, we restore the deformed arguments using semantic similarity-based matching and syntactic tree walking algorithms, followed by a T5-based predicate restoration. Finally, denoised training is adopted to optimize the model on the sufficient distribution.

Paraphrase Generation
To create a syntactically diverse paraphrase candidate set on D, we adopt AESOP (Sun et al., 2021), a syntactically controllable paraphrase generation model, as our generator. As can be seen in Figure 2, using the BART (Lewis et al., 2020) model as a backbone, the model takes source sentence <sep> source full syntactic parse <sep> target syntactic parse as the input sequence, and outputs a sequence of the form target syntactic parse <sep> paraphrase, in which the generated paraphrase conforms with the pruned target syntax.
The AESOP model used in our work is trained on parallel annotated data with two-level target syntactic trees. During generation, given the training set D, we first obtain the constituency parse trees {T^D_{s_1}, ..., T^D_{s_|D|}} and linearize them into parenthesized trees as the source full syntactic parses (a part is shown in Figure 2). Then, we collect a set of constituency parse pairs pruned at height 3, {(T^P_{s_1}, T^P_{t_1}), ..., (T^P_{s_|P|}, T^P_{t_|P|})}, from ParaNMT-50M (Wieting and Gimpel, 2018) and count their frequencies. For each sentence in D, following the original work, we obtain the m most similar parses {T^P_{s_1}, ..., T^P_{s_m}} by calculating weighted ROUGE scores between parse strings, and select the k top-ranked parses from {T^P_{t_1}, ..., T^P_{t_|P|}} for each T^P_{s_i} by sampling with the distribution:

p(T^P_t | T^P_{s_i}) = #(T^P_{s_i}, T^P_t) / Σ_{T'} #(T^P_{s_i}, T')

where #(T^P_{s_i}, T^P_t) refers to the count of occurrence in the statistical data. In the end, we generate k paraphrases for each sentence in D. For a trade-off between quality and quantity, we set k and m to 5 and 2, respectively. As a result, we get the paraphrase candidate set P, which is roughly five times the size of the training set D.
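The frequency-based sampling of target parse templates described above can be sketched as follows. The function name and the `parse_pair_counts` data structure are illustrative, not part of the released code; we assume counts of (source parse, target parse) pairs have already been collected from ParaNMT-50M.

```python
import random


def sample_target_parses(source_parse, parse_pair_counts, k=5):
    """Sample k target parse templates for one source parse, weighted by
    how often each (source, target) parse pair co-occurs in the statistic
    data: p(t | s) = count(s, t) / sum_t' count(s, t')."""
    # Collect all target parses observed together with this source parse.
    candidates = [(t, c) for (s, t), c in parse_pair_counts.items()
                  if s == source_parse]
    if not candidates:
        return []
    total = sum(c for _, c in candidates)
    targets = [t for t, _ in candidates]
    weights = [c / total for _, c in candidates]
    # Sample with replacement according to the normalized frequencies.
    return random.choices(targets, weights=weights, k=k)
```

In practice the sampled target parse is concatenated with the source sentence and its full parse to form the AESOP input sequence.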

Knowledge Restoration
As the paraphrases change the expression form of the original sentence, we need to recover the knowledge of the transformed triples. The difficulty of knowledge restoration lies in two aspects. First, OpenIE arguments generally form a large span of words, which can be rearranged and rephrased in the target sentence. Second, the syntactic changes also lead to a transformation of the tense or voice of the verbs in the predicates. For example, in Figure 2, the argument the Earth changes its expression and length to become the name of the planet, and the predicate were searching changes its tense to become searched for. Therefore, we first locate the arguments with contextualized semantic matching and complete them with syntactic tree walking. Then, for each pair of recovered arguments, we restore the corresponding predicate with the T5 model (Raffel et al., 2020).

Argument Restoration
Due to the expressional transformations, it is difficult to find the corresponding arguments in the target paraphrase sentence with methods such as pattern matching. Therefore, we utilize semantic similarity computed with BERT (Devlin et al., 2019) to locate the arguments. We first compute the embeddings h_s ∈ R^{|s|×d} and h_t ∈ R^{|t|×d} for the source sentence s and the target paraphrase sentence t, respectively. Then, for a triple (a^s_1, p^s_r, a^s_2), where a^s_i → (l^s_i, r^s_i) and p^s_r → (l^s_p, r^s_p) denote the word spans in the source sentence, we calculate the semantic similarity scores c_{a_i}, c_r ∈ R^{|t|} by summing the cosine similarities between each word in a^s_i, p^s_r and the target words of t:

c_{a_i}[j] = Σ_{k=l^s_i}^{r^s_i} cos(h_s[k], h_t[j]),    c_r[j] = Σ_{k=l^s_p}^{r^s_p} cos(h_s[k], h_t[j])

Next, we merge the consecutive indices of target words whose semantic similarity scores are greater than a threshold τ to get the candidate spans {(l^t_{i1}, r^t_{i1}), ..., (l^t_{im}, r^t_{im})} and {(l^t_{p1}, r^t_{p1}), ..., (l^t_{pm}, r^t_{pm})} for a^s_i and p^s_r, and the final triples are obtained by selecting the set of spans with the highest total score and no overlap. By applying this algorithm to P, we get the dataset D_P. We refer to the set expanded with this newly built set as D_Φ = D ∪ D_P.

Though the resulting spans based on semantic similarity matching are accurate in position, we find them incomplete, because words such as prepositions or adverbs cannot be matched effectively by the contextualized embeddings. On the other hand, a subtree with NP, QP, or NX as the root in a constituency parse represents a continuous phrase fragment. Therefore, we propose to use syntactic tree walking to further complete the target arguments. Specifically, for each word in a span (l^t_{ij}, r^t_{ij}), we perform a post-order traversal of the target syntactic tree to find the subtree that has NP, QP, or NX as the root and contains the word as a node. We obtain the refined span by replacing the original span with the corresponding words of the subtree (if the subtree covers the original span; otherwise the original span is retained). Finally, we select the optimal target spans {(l^t_{*1}, r^t_{*1}), ..., (l^t_{*n}, r^t_{*n})} of all arguments from the refined span sets by a simple optimality criterion that maintains the n spans with the highest similarity and no overlaps. We present the argument restoration in detail as Algorithm 1.
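The first stage of the restoration, merging consecutive target positions whose summed cosine similarity exceeds τ into candidate spans, can be sketched as below. This is a simplified illustration: the syntactic tree-walking completion and the overlap-free span selection are omitted, and the function name is our own.

```python
def locate_argument_spans(sim_scores, tau=0.7):
    """Merge consecutive target-word indices whose similarity score
    (summed cosine similarity to the source argument's tokens) exceeds
    tau into candidate (left, right) spans, both ends inclusive."""
    spans, start = [], None
    for j, score in enumerate(sim_scores):
        if score > tau and start is None:
            start = j                      # open a new span
        elif score <= tau and start is not None:
            spans.append((start, j - 1))   # close the current span
            start = None
    if start is not None:                  # span running to the end
        spans.append((start, len(sim_scores) - 1))
    return spans
```

Each returned span would then be refined against the constituency tree and scored for the final non-overlapping selection.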

Predicate Restoration
As the paraphrase may change the voice and tense of the predicate of the original sentence, it is not applicable to recover the predicate using the same algorithm as the argument restoration; moreover, many predicates cannot be found as a continuous span of the original sentence. We therefore adopt the T5 model (Raffel et al., 2020) to restore the predicate in the target paraphrase sentence. Specifically, we build a new dataset on D with the same corpus size. For each data sample in the new dataset, the input is of the form source sentence, argument_1, argument_2 </s>, and the output is a generated sequence referring to the predicate. We train the basic T5 model on this new dataset. Then, we restore the predicate for each pair of arguments obtained from Algorithm 1 to get the final refined set D_P. We refer to the refined final expanded set as D_Ψ = D ∪ D_P.

Algorithm 1 Argument Restoration
Input: source arguments (a^s_1, ..., a^s_n) and target paraphrase sentence t
Output: target arguments (a^t_1, a^t_2, ..., a^t_n)
1: get the target constituency parse tree T^t
2: subtree roots T = {NP, QP, NX}
3: threshold τ = 0.7
4: for each argument a^s_i ∈ (a^s_1, ..., a^s_n) do
5:   compute the similarity scores c_{a_i} over the words of t
6:   get candidate spans csp_i = {(l^t_{i1}, r^t_{i1}), ...} by merging the consecutive indices with values greater than τ in c_{a_i}
7:   for each span sp_{ij} ∈ csp_i do
8:     for each token tok_k ∈ sp_{ij} do
9:       traverse(T^t) to find the subtree T^t_k whose root is in T and that contains tok_k as a node
10:      if T^t_k covers sp_{ij}, replace sp_{ij} with the words of T^t_k
11:    end for
12:  end for
13: end for
14: return the spans {(l^t_{*i}, r^t_{*i}) | i = 1, ..., n} with the highest scores and without overlaps
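The input/output format for the predicate-restoration model can be sketched as a small helper. The comma separators and the `</s>` end-of-sequence token follow the paper's description of the input form; the exact tokenization details are an assumption, and the function name is our own.

```python
def build_predicate_example(source_sentence, arg1, arg2, predicate=None):
    """Format one example for the predicate-restoration model.

    Input:  'source sentence, argument_1, argument_2 </s>'
    Target: the predicate string (only available when building
            training data; at inference time it is generated by T5)."""
    src = f"{source_sentence}, {arg1}, {arg2} </s>"
    if predicate is not None:
        return src, predicate  # (input, target) training pair
    return src                 # inference input only
```

At training time the target side is the gold predicate from D; at restoration time, T5 generates the predicate conditioned on each recovered argument pair.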

Denoised Training
During training, we aim to maximize the expectation of the log-likelihood with respect to the data distribution:

θ* = argmax_θ E_{s ∼ p_d} [ log p_θ(r_1, ..., r_m | s) ]

where p_d refers to a training set, and p_θ is a neural network model with learnable parameters θ, which either employs the sequence-labeling paradigm to predict classification labels over the input sequence, or leverages the generative paradigm to generate target triples one token at a time. In this paper, we validate the proposed training framework on IMOJIE (Kolluru et al., 2020b), a strong generative model that predicts triples conditioned on the previous generations.
As the rephrasing of large argument spans may introduce false-positive word noise, we employ a simple but effective masking strategy to ignore the impact of negative words while retaining the contribution of valuable correct words in the span. For a triple (a_1, p_r, a_2), we calculate the importance of each word in an argument a_i based on its semantic matching score obtained from the argument restoration algorithm. For words recovered from the syntactic tree, we set the importance to the average value of the other words. We finally normalize the reciprocals of these importance scores and randomly select 15% of all words according to the resulting probability distribution. These sampled words are masked so that their gradients are not calculated during training. Note that we only mask words in the arguments, as the predicate is short and less noisy.
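The importance-weighted mask sampling can be sketched as follows, assuming strictly positive importance scores. The function name and the NumPy-based sampling are illustrative of the strategy, not the authors' exact implementation.

```python
import numpy as np


def sample_mask_indices(importance, mask_ratio=0.15, rng=None):
    """Select argument-word positions to mask during training.

    Lower-importance words (weaker semantic match, hence likelier noise)
    are masked more often: we normalize the reciprocals of the importance
    scores into a probability distribution and sample mask_ratio of the
    positions without replacement. Scores must be strictly positive."""
    rng = rng or np.random.default_rng(0)
    importance = np.asarray(importance, dtype=float)
    probs = (1.0 / importance)
    probs /= probs.sum()
    n_mask = max(1, int(round(mask_ratio * len(importance))))
    return sorted(rng.choice(len(importance), size=n_mask,
                             replace=False, p=probs))
```

The returned positions would be excluded from the loss, so their gradients are never computed.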

Experiment
This work proposes a syntactically robust training framework including two knowledge restoration strategies. Therefore, our experiments are intended to demonstrate both the effectiveness and the robustness of the proposed framework on the test sets.

Datasets
We use the standard training set OpenIE4 (Kolluru et al., 2020b) and the constructed sets D_Φ, D_Ψ for model training. For evaluation, in addition to the benchmark dataset CaRB (Bhardwaj et al., 2019), we build a syntactically diverse evaluation set to validate the robustness of OpenIE models. We use the OpenIE4 dataset as the basic set D in our experiments, which was released by Kolluru et al. (2020b) and pre-processed by Kolluru et al. (2020a). The data is automatically built by running OpenIE-4, ClausIE, and RnnOIE on sentences obtained from Wikipedia.

Training set
To estimate the quality of the generated samples in D_Φ and D_Ψ, we conduct fine-grained human verification by randomly sampling 100 data samples from each set. For a fair comparison, taking the triples from the human-annotated dataset CaRB as the reference criteria, we evaluate the generated samples at the fact level and the span level, respectively. Specifically, a triple is fact-level correct if all elements in the triple conform with the definition of arguments or predicate. A triple is span-level correct only if all arguments and predicates contain the complete word span in the sample sentence. The overall statistics are shown in Table 1. We can see that although the fact-level accuracy shows that D_Φ is usable, the spans of arguments and predicates are extremely inaccurate, with an accuracy of only 34%. By further performing the syntactic tree walking-based argument restoration and the predicate restoration, we improve the fact-level and span-level accuracy to 91% and 71%, respectively, indicating that the generated data are satisfactory. We use the standard benchmark CaRB (Bhardwaj et al., 2019) to evaluate the proposed framework, a high-quality crowdsourced dataset with 1282 sentences, each manually annotated with about 4 n-ary tuples.
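The span-level criterion above, that every element of a triple must appear as a contiguous word span of the sentence, can be sketched as a simple word-level check. Whitespace tokenization and the function name are simplifying assumptions for illustration.

```python
def span_level_correct(sentence, triple):
    """Return True only if every element of the triple (arguments and
    predicate alike) occurs as a contiguous word span of the sentence."""
    words = sentence.split()

    def is_contiguous_span(element):
        el_words = element.split()
        return any(words[i:i + len(el_words)] == el_words
                   for i in range(len(words) - len(el_words) + 1))

    return all(is_contiguous_span(el) for el in triple)
```

Fact-level correctness, by contrast, is a semantic judgment (does each element fit the definition of an argument or predicate?) and requires a human annotator.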

Evaluation set
In order to evaluate the syntactic robustness of OpenIE models, we build a syntactically diverse dataset based on CaRB with the proposed framework. We generate 5 paraphrases for each sentence in CaRB, and obtain 2269 high-quality sentences after performing the knowledge restoration. We refer to this automatically generated dataset as CaRB-AutoPara. The statistics of both datasets are shown in Table 2. We can see that the newly built dataset is twice as large in scale, and the lengths of its arguments and predicates conform with CaRB.

Evaluation Metrics
We use the scoring system proposed by Bhardwaj et al. (2019) to evaluate the OpenIE models on the two test sets. The system first creates an all-pair matching table, with each column as a prediction tuple and each row as a gold tuple. It then computes single-match precision and multi-match recall by considering the number of common tokens in each (gold, prediction) pair for each element of the fact.
Based on the confidence of each output triple, we report three important metrics: (1) Optimal F1: the largest F1 value on the P-R curve, (2) AUC: the area under the P-R curve, and (3) Last F1: the F1 score computed at the point of zero confidence.
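Given a list of (precision, recall) points ordered from the highest confidence threshold down to zero confidence, the three metrics can be computed as sketched below. This is an illustration of the definitions, not the official CaRB scorer.

```python
def summarize_pr_curve(points):
    """Compute (optimal F1, AUC, last F1) from a precision-recall curve.

    `points` is a list of (precision, recall) pairs ordered from the
    highest confidence threshold to zero confidence, so recall is
    non-decreasing along the list."""
    f1 = [2 * p * r / (p + r) if p + r > 0 else 0.0 for p, r in points]
    optimal_f1 = max(f1)        # best point on the curve
    last_f1 = f1[-1]            # the zero-confidence point
    # Area under the P-R curve by the trapezoidal rule over recall.
    auc = 0.0
    for (p1, r1), (p2, r2) in zip(points, points[1:]):
        auc += (r2 - r1) * (p1 + p2) / 2.0
    return optimal_f1, auc, last_f1
```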

Experimental Settings
We follow the original work to train a BART-based paraphrase model (Sun et al., 2021) on ParaNMT-small (Chen et al., 2019), and the syntactic mapping set is collected from ParaNMT-50M (Wieting and Gimpel, 2018). For knowledge restoration, we use the pre-trained BERT (Devlin et al., 2019) model to calculate the cosine similarities, and fine-tune the T5 model (Raffel et al., 2020) with a language model head for the predicate restoration. The threshold τ and the number of maintained spans k are empirically set to 0.7 and 5, respectively.
We train two implementations of the proposed framework based on the baseline model IMOJIE (Kolluru et al., 2020b) to investigate its effectiveness and syntactic robustness. IMOJIE_Φ is trained on D_Φ, which adopts only semantic similarity matching as the knowledge restoration method. IMOJIE_Ψ is trained on D_Ψ, which uses the entire knowledge restoration algorithm. All models follow the original implementations, using BERT as the encoder and an LSTM with the CopyAttention mechanism (Cui et al., 2018a) as the decoder. The detailed parameter settings are shown in Appendix A.

Results on Different Datasets
How does the proposed framework perform on syntactically identically distributed data?
In comparison with the baseline model, we find that the proposed syntactically robust training framework generally enhances the OpenIE model to achieve better performance on identically distributed data, as shown in Table 3, where we compare the baseline with our enhanced models. To investigate the effectiveness as well as the syntactic robustness in the open-world setting, we evaluate the models on the syntactically diverse set CaRB-AutoPara. We find that the proposed training framework comprehensively improves the syntactic robustness of the existing model, making it exhibit consistently better performance on non-identically distributed data. As shown in Table 4, the best-performing model significantly outperforms the baseline. The proposed evaluation set CaRB-AutoPara is more challenging for OpenIE models trained on existing general datasets, since its syntactic structures vary with respect to the training set. Taking the same example mentioned above, CaRB-AutoPara contains sentences with a different voice and tense, such as the question Isn't it possible that he died of a heart attack?.

Analysis
We further explore the performance of the model on different subsets representing prototypical syntactic categories, and analyze how the model's performance changes as the syntactic difference between the training set and each subset changes.
How to effectively measure the syntactic difference between sentences? As the training data is massive, we need an efficient metric of the syntactic difference between sentences to divide the test set and to calculate the syntactic distance between the training set and the test set.
We propose a simple but effective syntactic distance algorithm, called Hierarchical Weighted Syntactic Distance (HW-Syntactic Distance), to measure these differences. Intuitively, the more similar the skeletons of two sentences are, the smaller their syntactic difference, i.e., the smaller the syntactic distance. We use a hierarchical weighted matching strategy on the constituency parse trees to calculate the syntactic distance between two sentences. As shown in Figure 3, given two sentences with their constituency parse trees T_1, T_2 pruned at height 3, we first transform the tree nodes of T_1, T_2 into sequences q_1, q_2 based on a level-order traversal. Then, we use the longest-substring matching algorithm to accumulate the total matching length l_tot of the two sequences, where the length of the i-th matched substring is multiplied by a sequentially discounting weight w_i. The final distance is normalized by the minimum sequence length of q_1, q_2, and its value domain is [0, 1]. The detailed algorithm of the HW-Syntactic Distance is available in Appendix B.1.
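A minimal sketch of the HW-Syntactic Distance, assuming a greedy longest-common-substring matching loop and a discount factor of 0.95 (the weights w_1 = 0.95^0, w_2 = 0.95^1 appear in the Figure 3 annotations; the greedy loop and the final 1 − similarity normalization are our reading of the description, with the full algorithm in Appendix B.1):

```python
def hw_syntactic_distance(q1, q2, discount=0.95):
    """HW-Syntactic Distance between two non-empty level-order node-label
    sequences of constituency trees pruned at height 3. Repeatedly extract
    the longest common contiguous substring, weight the i-th match by
    discount**i, and normalize by the shorter sequence length."""
    a, b = list(q1), list(q2)
    total, i = 0.0, 0
    while True:
        # Longest common substring of the remaining labels (O(n*m) DP).
        best, end_a, end_b = 0, 0, 0
        dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
        for x in range(1, len(a) + 1):
            for y in range(1, len(b) + 1):
                if a[x - 1] == b[y - 1]:
                    dp[x][y] = dp[x - 1][y - 1] + 1
                    if dp[x][y] > best:
                        best, end_a, end_b = dp[x][y], x, y
        if best == 0:
            break
        total += best * discount ** i   # discounted matching length
        i += 1
        del a[end_a - best:end_a]       # remove matched labels and rematch
        del b[end_b - best:end_b]
    similarity = total / min(len(q1), len(q2))
    return 1.0 - min(similarity, 1.0)   # distance in [0, 1]
```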
How do models trained on a partial syntactic distribution perform on syntax-specific data?
Based on this syntactic difference metric, we further analyze the performance of models trained on partially observed syntactic data D on different syntactic-specified datasets.
To this end, we first cluster the CaRB sentences into k subsets using the HW-Syntactic Distance. Then, we randomly sample 300 sentences from the training set, and calculate the distance between the training set and each subset by averaging the distances between the sampled training sentences and each cluster center. We empirically cluster the CaRB sentences into 5 subsets with the optimal distance cost; partial clustering results are available in Appendix B.3.
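The paper does not state which clustering algorithm was used. Since only a pairwise distance metric is available (rather than a vector space), a natural choice is k-medoids, sketched below over a precomputed HW-Syntactic Distance matrix; the implementation and its defaults are an assumption.

```python
import numpy as np


def k_medoids(dist, k=5, iters=20, seed=0):
    """Minimal k-medoids over a precomputed pairwise distance matrix
    (e.g. HW-Syntactic Distances between CaRB sentences). Returns the
    medoid indices (cluster centers) and a cluster label per sentence."""
    rng = np.random.default_rng(seed)
    n = dist.shape[0]
    medoids = rng.choice(n, size=k, replace=False)
    for _ in range(iters):
        # Assign every point to its nearest medoid.
        labels = np.argmin(dist[:, medoids], axis=1)
        new = medoids.copy()
        for c in range(k):
            members = np.where(labels == c)[0]
            if len(members) > 0:
                # New medoid: the member minimizing total within-cluster distance.
                within = dist[np.ix_(members, members)].sum(axis=1)
                new[c] = members[np.argmin(within)]
        if np.array_equal(np.sort(new), np.sort(medoids)):
            break
        medoids = new
    labels = np.argmin(dist[:, medoids], axis=1)
    return medoids, labels
```

The resulting cluster centers can then serve as the reference points for the training-set-to-subset distance computation.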
We find that the performance of the model on the subsets gradually increases as the syntactic distance between the training set and the test subsets decreases. As shown in Table 5, compared to the best performance of 55.3 obtained on the subset CaRB-C5 with a distance of 0.227, the basic model only achieves an optimal F1 score of 47.0 on the subset CaRB-C1. In addition, we find that our fully enhanced model is consistently better than the basic model trained on the partial syntactic distribution, suggesting that the proposed training framework comprehensively improves the syntactic robustness of the OpenIE model. More analysis and results on the syntactic distribution are provided in Appendix B.2.

Figure 4: A case study shows the partial predictions of the model trained with the proposed framework. (Source sentence: For patients who do not recover quickly, the protocol also includes support groups and/or psychotherapy. Paraphrase: A supportive groups and psychotherapy should also be included in the protocol for patients who are not rapidly recovering.)
To alleviate the problem that neural models rely heavily on labor-intensive annotated data, Tang et al. (2020) propose an unsupervised method that pretrains the model on synthetic data automatically labeled by patterns and then refines it with a reinforcement learning process.

Conclusion
In this paper, we focus on the problem that the syntactic distribution of training data is only partially observed, and propose a syntactically robust training framework that enables OpenIE models to be trained on a syntactically abundant distribution based on diverse paraphrase generation. We propose a knowledge restoration algorithm to recover the deformed triples in syntactically transformed sentences based on semantic similarity-based matching and syntactic tree walking. To investigate the syntactic robustness of models, we build a syntactically diverse evaluation set that is consistent with the real-world setting. The experimental results with extensive analysis demonstrate the effectiveness of our framework.

Figure 1: Cluster CaRB into 5 subsets based on the HW-Syntactic Distance and evaluate the IMOJIE model on them. The horizontal axis indicates the indices sorted by the number of samples (above the bars) in the subsets. The left and right vertical axes represent the F1 scores of the model and the distance between the training set and the clustering center of each subset, respectively.

Figure 2: Overview of the proposed framework. Based on the diverse paraphrase candidate set generated by a syntactically controllable model, two algorithms, semantic similarity-based argument localization and syntactic tree walking, are used to restore the deformed arguments. By taking the arguments as conditions, the predicates are generated with the T5 model.

Figure 4 shows a case study of the proposed framework with different implementations. As shown, compared to the original training sample, the generated sample exhibits a syntactically different structure. The model trained on the dataset extended with only the semantic similarity-based knowledge restoration can extract just two separate triples around the predicate should also be included in. With the full knowledge restoration algorithms, the trained model can extract all related triples for the predicate. A part of the generated samples based on the proposed syntactically robust training framework is shown in Appendix C.

Table 2: Evaluation set statistics. # sent. refers to the total number of sentences, and arg. len / pre. len are the average lengths of arguments/predicates of all samples in the corresponding data, respectively.


Table 3: Experimental results on CaRB.

Table 5: Experimental results on different subsets of syntactic categories.