RexUIE: A Recursive Method with Explicit Schema Instructor for Universal Information Extraction

Universal Information Extraction (UIE) is an area of interest due to the challenges posed by varying targets, heterogeneous structures, and demand-specific schemas. However, previous works have only achieved limited success by unifying a few tasks, such as Named Entity Recognition (NER) and Relation Extraction (RE), and fall short of being authentic UIE models, particularly when extracting other general schemas such as quadruples and quintuples. Additionally, these models used an implicit structural schema instructor, which could lead to incorrect links between types, hindering the model's generalization and performance in low-resource scenarios. In this paper, we redefine authentic UIE with a formal formulation that encompasses almost all extraction schemas. To the best of our knowledge, we are the first to introduce UIE for any kind of schema. In addition, we propose RexUIE, a Recursive Method with Explicit Schema Instructor for UIE. To avoid interference between different types, we reset the position ids and attention mask matrices. RexUIE shows strong performance under both full-shot and few-shot settings and achieves state-of-the-art results on tasks requiring the extraction of complex schemas.


Introduction
As a fundamental task of natural language understanding, Information Extraction (IE) has been widely studied, including Named Entity Recognition (NER), Relation Extraction (RE), Event Extraction (EE), and Aspect-Based Sentiment Analysis (ABSA). However, task-specific model structures hinder the sharing of knowledge and structure within the IE community.
Some recent studies attempt to model NER, RE, and EE together to take advantage of the dependencies between subtasks (Lin et al., 2020; Nguyen et al., 2021). Lu et al. (2022) took the Structural Schema Instructor (SSI) as input and the Structured Extraction Language (SEL) as output, and proposed a unified text-to-structure generation framework based on T5-Large, while Lou et al. (2023) introduced three unified token linking operations and uniformly extracted substructures in parallel, achieving new SoTAs on IE tasks. However, these models have only achieved limited success by unifying a few tasks, such as NER and RE, while ignoring extractions with more than two spans, such as quadruples and quintuples; they thus fall short of being true UIE models. As illustrated in Figure 1 (a), previous UIE models can only extract a pair of spans along with the relation between them, while ignoring other qualifying spans (such as location and time) that carry information related to the two entities and their relation.
Moreover, previous UIE models fall short of explicitly utilizing the extraction schema to restrict outcomes. The relation "work for" provides a case wherein the subject and object are person and organization entities, respectively. Omitting an explicit schema can lead to spurious results, hindering the model's generalization and performance in resource-limited scenarios.
In this paper, we redefine Universal Information Extraction (UIE) via a comprehensive formal framework that covers almost all extraction schemas. To the best of our knowledge, we are the first to introduce UIE for any kind of schema. Additionally, we introduce RexUIE, which is a Recursive Method with Explicit Schema Instructor for UIE. RexUIE recursively runs queries for all schema types and utilizes three unified token-linking operations to compute the results of each query. We construct an Explicit Schema Instructor (ESI), providing rich label semantic information to RexUIE and ensuring that the extraction results meet the constraints of the schema. The ESI and the text are concatenated to form the query.
Take Figure 1 (b) as an example: given the extraction schema, RexUIE first extracts "Leonard Parker" classified as a "Person", then extracts "Harvard University" classified as "University" coupled with the relation "Educated At" according to the schema. Then, based on the extracted tuples ("Leonard Parker", "Person") and ("Harvard University", "Educated At (University)"), RexUIE derives the span "PhD" classified as an "Academic Degree". RexUIE extracts spans recursively based on the schema, allowing the extraction of more than two spans, such as quadruples and quintuples, rather than being limited exclusively to pairs of spans and their relation.
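The recursive procedure above can be sketched in a few lines. This is an illustrative driver, not the paper's implementation: `extract_spans` is a hypothetical stand-in for a single RexUIE query, here hard-coded to reproduce the Figure 1 (b) example.

```python
def extract_spans(text, prefix, allowed_types):
    # Toy stand-in for one RexUIE query: returns hard-coded matches
    # for the Figure 1 (b) example, filtered by the allowed types.
    toy = {
        (): [("Leonard Parker", "person")],
        (("Leonard Parker", "person"),):
            [("Harvard University", "educated at (university)")],
        (("Leonard Parker", "person"),
         ("Harvard University", "educated at (university)")):
            [("PhD", "academic degree")],
    }
    return [(s, t) for s, t in toy.get(prefix, []) if t in allowed_types]

def recursive_extract(text, schema, prefix=()):
    """Collect (span, type) paths by querying one schema level at a time."""
    results = []
    for span, typ in extract_spans(text, prefix, set(schema)):
        path = prefix + ((span, typ),)
        children = schema[typ]
        if children:  # descend with the extracted pair appended to the prefix
            results.extend(recursive_extract(text, children, path))
        results.append(path)
    return results

schema = {"person": {"educated at (university)": {"academic degree": {}}}}
text = "Leonard Parker received his PhD from Harvard University."
paths = recursive_extract(text, schema)
```

Each returned path is a sequence of (span, type) pairs, so a depth-3 path corresponds to one extracted triple-plus-qualifier structure.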
We pre-trained RexUIE on a combination of supervised NER and RE datasets, Machine Reading Comprehension (MRC) datasets, and 3 million Joint Entity and Relation Extraction (JERE) instances constructed via Distant Supervision. Extensive experiments demonstrate that RexUIE surpasses the state-of-the-art performance on various tasks and outperforms previous UIE models in few-shot experiments. Additionally, RexUIE exhibits remarkable superiority in extracting quadruples and quintuples.
The contributions of this paper can be summarized as follows: 1. We redefine true Universal Information Extraction (UIE) through a formal framework that covers almost all extraction schemas, rather than only extracting spans and pairs.
2. We introduce RexUIE, which recursively runs queries for all schema types and utilizes three unified token-linking operations to compute the outcomes of each query. It employs explicit schema instructions to augment label semantic information and enhance performance in low-resource scenarios.
3. We pre-train RexUIE to enhance low-resource performance. Extensive experiments demonstrate its remarkable effectiveness: RexUIE not only surpasses previous UIE models and task-specific SoTAs in extracting entities, relations, quadruples and quintuples, but also outperforms large language models (such as ChatGPT) under the zero-shot setting.

Related Work
Task-specific models for IE have been extensively studied, including Named Entity Recognition (Lample et al., 2016; Yan et al., 2021a; Wang et al., 2021), Relation Extraction (Li et al., 2022; Zhong and Chen, 2021; Zheng et al., 2021), Event Extraction (Li et al., 2021), and Aspect-Based Sentiment Analysis (Zhang et al., 2021; Xu et al., 2021). Some recent works attempted to jointly extract entities, relations and events (Nguyen et al., 2022; Paolini et al., 2021). OneIE (Lin et al., 2020) first extracted the globally optimal IE result as a graph from an input sentence, and incorporated global features to capture cross-subtask and cross-instance interactions. FourIE (Nguyen et al., 2021) introduced an interaction graph between instances of the four tasks. Wei et al. (2020) proposed using consistent tagging schemes to model the extraction of entities and relations. Wang et al. (2020) extended the idea to a unified matrix representation. TPLinker formulates joint extraction as a token pair linking problem and introduces a novel handshaking tagging scheme that aligns the boundary tokens of entity pairs under each relation type.
Another approach that has addressed joint information extraction with great success is the text-to-text language generation model. Lu et al. (2021a) generated the linearized sequence of trigger words and arguments in a text-to-text manner. Kan et al. (2022) proposed to jointly extract information by adding general or task-specific prompts in front of the text. Lu et al. (2022) introduced unified structure generation for UIE: they proposed a framework based on the T5 architecture to generate SEL containing the specified types and spans. However, the autoregressive method suffers from low GPU utilization. Lou et al. (2023) proposed an end-to-end framework for UIE, called USM, by designing three unified token linking operations. Empirical evaluation on 4 IE tasks showed that USM has strong generalization ability in zero/few-shot transfer settings.

Figure 2: The overall framework of RexUIE. We illustrate the computation process of the i-th query and the construction of the (i+1)-th query. M_i denotes the attention mask matrix, and Z_i denotes the score matrix obtained by decoding. Y_i denotes the output of the i-th query, with all outputs ultimately combined to form the overall extraction result.

Redefine Universal Information Extraction
While Lu et al. (2022) and Lou et al. (2023) proposed Universal Information Extraction as a method of addressing NER, RE, EE, and ABSA with a single unified model, their approaches were limited to only a few tasks and ignored schemas that contain more than two spans, such as quadruples and quintuples. Hence, we redefine UIE to cover the extraction of more general schemas.
In our view, genuine UIE extracts a collection of structured information from the text, with each item consisting of n spans s = [s_1, s_2, ..., s_n] and n corresponding types t = [t_1, t_2, ..., t_n]. The spans are extracted from the text, while the types are defined by a given schema. Each pair (s_i, t_i) is a target to be extracted.
Formally, we propose to maximize the probability in Equation 1:

p(A | x, C_n) = ∏_{(s,t) ∈ A} ∏_{i=1}^{n} p( (s,t)_i | (s,t)_{<i}, x, C_n )    (1)

where C_n denotes the hierarchical schema (a tree structure) with depth n, and A is the set of all sequences of annotated information. t = [t_1, t_2, ..., t_n] is one of the type sequences (paths in the schema tree), and x is the text. s = [s_1, s_2, ..., s_n] denotes the sequence of spans corresponding to t. We use (s,t)_i to denote the pair of s_i and t_i; similarly, (s,t)_{<i} denotes [s_1, s_2, ..., s_{i-1}] and [t_1, t_2, ..., t_{i-1}]. A_i | (s,t)_{<i} is the set of the i-th items of all sequences led by (s,t)_{<i} in A. To clarify the symbols, we present some examples in Appendix H.
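The hierarchical schema C_n and its type sequences can be made concrete with a small helper. The nested-dict encoding below mirrors the schema notation used later in the paper (e.g. {"person": {"work for (organization)": null}, "organization": null}); the helper itself is only an illustration, not the paper's code.

```python
def type_paths(schema, prefix=()):
    """Enumerate every root-to-node type path in a nested-dict schema tree."""
    paths = []
    for typ, children in schema.items():
        path = prefix + (typ,)
        paths.append(path)
        if children:  # descend into sub-types, if any
            paths.extend(type_paths(children, path))
    return paths

# Schema from the NER/RE example: a relation type nested under its subject type.
schema = {"person": {"work for (organization)": None}, "organization": None}
paths = type_paths(schema)
```

Each path is one legal type sequence t; an extraction target pairs such a path with a sequence of spans of the same length.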

RexUIE
In this Section, we introduce RexUIE: A Recursive Method with Explicit Schema Instructor for Universal Information Extraction.
RexUIE models the learning objective in Equation 1 as a series of recursive queries, with three unified token-linking operations employed to compute the outcomes of each query. The condition (s,t)_{<i} in Equation 1 is represented by the prefix in the i-th query, and (s,t)_i is calculated by the linking operations.

Framework of RexUIE
Figure 2 shows the overall framework. RexUIE recursively runs queries for all schema types. Given the i-th query Q_i, we adopt a pre-trained language model as the encoder to map the tokens to hidden representations h_i ∈ R^{n×d}, where n is the length of the query and d is the dimension of the hidden states; P_i and M_i denote the position ids and the attention mask matrix of Q_i, respectively. Next, the hidden states are fed into two feed-forward neural networks, FFNN_q and FFNN_k.
Then we apply rotary embeddings following Su et al. (2021, 2022) to calculate the score matrix Z_i:

Z_i^{j,k} = M_i^{j,k} ⊗ ( FFNN_q(h_i^j)^T R(P_i^k − P_i^j) FFNN_k(h_i^k) )    (3)

where M_i^{j,k} and Z_i^{j,k} denote the mask value and the score from token j to token k, respectively, and P_i^j and P_i^k denote the position ids of tokens j and k. ⊗ is the Hadamard product. R(P_i^k − P_i^j) ∈ R^{d×d} denotes the rotary position embedding (RoPE), a relative position encoding method with promising theoretical properties.
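The rotary scoring step can be sketched numerically. This is a hedged illustration with toy dimensions and a standard RoPE frequency schedule, not the paper's configuration; it checks the key property that the dot product of rotated query and key vectors depends only on the relative position of the two tokens.

```python
import numpy as np

def rotate(x, pos, theta):
    # Apply RoPE: rotate consecutive (even, odd) pairs of x by angle pos*theta.
    x1, x2 = x[0::2], x[1::2]
    cos, sin = np.cos(pos * theta), np.sin(pos * theta)
    out = np.empty_like(x)
    out[0::2] = x1 * cos - x2 * sin
    out[1::2] = x1 * sin + x2 * cos
    return out

d, n = 8, 5  # toy hidden size and query length
theta = 1.0 / (10000 ** (np.arange(d // 2) / (d // 2)))
rng = np.random.default_rng(0)
q = rng.normal(size=(n, d))   # stand-in for FFNN_q(h)
k = rng.normal(size=(n, d))   # stand-in for FFNN_k(h)

# Score from token j to token t via rotated representations.
Z = np.array([[rotate(q[j], j, theta) @ rotate(k[t], t, theta)
               for t in range(n)] for j in range(n)])

# Relative-position property: shifting both positions leaves the score unchanged.
assert np.allclose(rotate(q[1], 1 + 3, theta) @ rotate(k[2], 2 + 3, theta), Z[1, 2])
```

In the full model this unmasked score would additionally be multiplied elementwise by the attention mask before decoding.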
Finally, we decode the score matrix Z_i to obtain the output Y_i, and use it to create the subsequent query Q_{i+1}. All ultimate outputs are merged into the result set Y = {Y_1, Y_2, ...}.
We utilize Circle Loss (Sun et al., 2020; Su et al., 2022) as the loss function of RexUIE, which is very effective for computing the loss of sparse matrices, where Z̄_i is a flattened version of Z_i, and Ẑ_i denotes the flattened ground truth, containing only 1s and 0s.
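The objective can be sketched as a log-sum-exp over positive and negative cells, following the multi-label formulation of Su et al. (2022): negative scores are pushed below zero and positive scores above zero. Exact scaling and masking details in RexUIE are assumptions here.

```python
import numpy as np

def circle_loss(z_flat, y_flat):
    """Circle-Loss-style objective on a flattened score matrix.

    z_flat: flattened scores Z_i; y_flat: flattened ground truth (1s and 0s).
    """
    pos = z_flat[y_flat == 1]
    neg = z_flat[y_flat == 0]
    loss_pos = np.log1p(np.exp(-pos).sum())  # pull positives above 0
    loss_neg = np.log1p(np.exp(neg).sum())   # push negatives below 0
    return loss_pos + loss_neg

z = np.array([5.0, -4.0, -6.0, 3.0])   # toy flattened score matrix
y = np.array([1, 0, 0, 1])             # toy flattened ground truth
loss = circle_loss(z, y)
```

Because most cells of the score matrix are negatives, grouping them inside a single log-sum-exp keeps the loss well behaved on very sparse targets.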

Explicit Schema Instructor
The i-th query Q_i consists of an Explicit Schema Instructor (ESI) and the text x. ESI is a concatenation of a prefix p_i and types t_i, and is constructed based on the sequence of previously extracted types and the corresponding sequence of spans. t_i specifies what types can potentially be identified from x given p_i.
We insert a special token [P] before each prefix and a [T] before each type. Additionally, we insert a token [Text] before the text x. The input Q_i can then be represented as the concatenation of the [P]-marked prefix, the [T]-marked types, and the [Text]-marked text. The biggest difference between ESI and an implicit schema instructor is that the sub-types that each type can undertake are explicitly specified. Given the parent type, the semantic meaning of each sub-type is richer, so RexUIE has a better understanding of the labels.
Some detailed examples of ESI are listed in Appendix I.
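As a rough illustration of this query format, a query string can be assembled as below; the exact spacing and tokenizer handling of the special tokens are assumptions, not the paper's specification.

```python
def build_query(prefix, types, text):
    """Assemble an ESI + text query in the [P]/[T]/[Text] format."""
    parts = ["[P]"]
    if prefix:  # empty prefix for the first (top-level) query
        parts.append(prefix)
    for t in types:
        parts.append("[T] " + t)
    parts.append("[Text] " + text)
    return " ".join(parts)

# First query: no prefix, top-level entity types.
q0 = build_query("", ["location", "organization", "person"],
                 "Steve Jobs founded Apple.")
# Second query: prefix from an extracted entity, relation sub-types.
q1 = build_query("person: Steve Jobs", ["work for (organization)"],
                 "Steve Jobs founded Apple.")
```

The prefix encodes the previously extracted (span, type) pairs, so each recursion level only needs to list the sub-types legal under that prefix.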

Token Linking Operations
Given the calculated score matrix Z, we obtain Z′ from Z by a predefined threshold δ. Token linking is then performed on Z′, which takes binary values of either 1 or 0 (Wang et al., 2020; Lou et al., 2023). A token link is established from the i-th token to the j-th token only if Z′_{i,j} = 1; otherwise, no link exists. To illustrate this process, consider the example depicted in Figure 3, where we expound how entities and relations can be extracted based on the score matrix.
Token Head-Tail Linking Token head-tail linking serves the purpose of span detection: if i ≤ j and Z′_{i,j} = 1, the span Q_{i:j} should be extracted. The orange section in Figure 3 performs token head-tail linking, wherein both "Steve Jobs" and "Apple" are recognized as entities. Consequently, a link exists from "Steve" to "Jobs" and another from "Apple" to itself.
Token Head-Type Linking Token head-type linking refers to the link established between the head of a span and its type. To signify the type, we utilize the special token [T], which is positioned just before the type token. As highlighted in the green section of Figure 3, "Steve Jobs" qualifies as a "person" span, so a link points from "Steve" to the [T] token that precedes "person". Similarly, a link exists from "Apple" to the [T] token preceding "org".

Type-Token Tail Linking Type-token tail linking refers to the connection established between the type of a span and its tail. Similar to token head-type linking, we utilize the [T] token before the type token to represent the type. As highlighted in the blue section of Figure 3, a link exists from the [T] token preceding "person" to "Jobs" due to the prediction that "Steve Jobs" is a "person" span.
During inference, for a pair of tokens ⟨i, j⟩, if Z_{i,j} ≥ δ and there exists a [T] token at position k that satisfies Z_{i,k} ≥ δ and Z_{k,j} ≥ δ, we extract the span Q_{i:j} with the type that follows position k.
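The decoding rule can be sketched directly from the three linking operations. The toy score matrix below encodes the "Steve Jobs" example of Figure 3; token positions are illustrative, and a real decoder would additionally restrict span candidates to text tokens.

```python
import numpy as np

def decode(Z, type_positions, delta=0.5):
    """Extract (head, tail, type) triples from a thresholded score matrix."""
    n = Z.shape[0]
    results = []
    for i in range(n):
        for j in range(i, n):
            if Z[i, j] < delta:
                continue  # no token head-tail link
            for k, typ in type_positions.items():
                # head-type link i->k and type-tail link k->j must both hold
                if Z[i, k] >= delta and Z[k, j] >= delta:
                    results.append((i, j, typ))
    return results

# Toy query layout: [T] "person" at 0, [T] "org" at 1, text tokens from 2 on;
# "Steve" = token 2, "Jobs" = token 3.
Z = np.zeros((6, 6))
Z[2, 3] = 1.0   # token head-tail: "Steve" -> "Jobs"
Z[2, 0] = 1.0   # token head-type: "Steve" -> [T] person
Z[0, 3] = 1.0   # type-token tail: [T] person -> "Jobs"
spans = decode(Z, {0: "person", 1: "org"})
```

Only when all three links clear the threshold is the span emitted with its type, which is exactly the conjunction described above.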

Prompts Isolation
RexUIE can receive queries with multiple prefixes. To save time, we put different prefix groups in the same query. For instance, consider the text "Kennedy was fatally shot by Lee Harvey Oswald on November 22, 1963", which contains two "person" entities. We concatenate the two entity spans, along with their corresponding types in the schema, to obtain the ESI. Inspired by Yang et al. (2022), we present Prompts Isolation, an approach that mitigates interference among tokens of diverse types and prefixes. By modifying token type ids, position ids, and attention masks, the direct flow of information between these tokens is effectively blocked, enabling clear differentiation among distinct sections of the ESI. We illustrate Prompts Isolation in Figure 4. For the attention masks, each prefix token can only interact with the prefix itself, its sub-type tokens, and the text tokens. Each type token can only interact with the type itself, its corresponding prefix tokens, and the text tokens.
Then the position ids P and attention mask M in Equation 3 can be updated accordingly. In this way, potentially confusing information flow is blocked, and the model is not affected by the order of prefixes and types either.
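The masking scheme can be sketched as follows; the segment bookkeeping is an illustration of the rules above (Figure 4), not RexUIE's actual implementation.

```python
import numpy as np

def isolation_mask(segments):
    """Build a Prompts Isolation attention mask.

    segments: one (kind, group, type_id) tuple per token, where kind is
    "prefix", "type" or "text", group identifies the prefix group, and
    type_id distinguishes sibling types within a group.
    """
    n = len(segments)
    M = np.zeros((n, n), dtype=int)
    for a, (ka, ga, ta) in enumerate(segments):
        for b, (kb, gb, tb) in enumerate(segments):
            if ka == "text" or kb == "text":
                M[a, b] = 1                 # text is visible to every token
            elif ga != gb:
                continue                    # different prefix groups: blocked
            elif ka == "type" and kb == "type":
                M[a, b] = int(ta == tb)     # sibling types do not interact
            else:
                M[a, b] = 1                 # prefix <-> itself and its types
    return M

segs = [("prefix", 0, None), ("type", 0, 0), ("type", 0, 1),
        ("prefix", 1, None), ("type", 1, 0), ("text", None, None)]
M = isolation_mask(segs)
```

With position ids reset per group as well, the encoder sees each (prefix, types) block as if it were the only prompt in the query.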

Pre-training
To enhance the zero-shot and few-shot performance of RexUIE, we pre-trained it on three distinct datasets.

Distant Supervision data D distant We gathered the corpus and labels from Wikipedia (https://www.wikipedia.org/), and utilized Distant Supervision to align the texts with their respective labels.
Supervised NER and RE data D superv Compared with D distant, supervised data exhibits higher quality due to the absence of abstract or over-specialized classes, and there is no high false-negative rate caused by an incomplete knowledge base.

MRC data D mrc
The empirical results of previous work (Lou et al., 2023) show that incorporating machine reading comprehension (MRC) data into pre-training enhances the model's capacity to utilize the semantic information in prompts. Accordingly, we add MRC supervised instances to the pre-training data.
Details of the datasets for pre-training can be found in Appendix G.

Experiments
We conduct extensive experiments in this Section under both supervised and few-shot settings. For implementation, we adopt DeBERTaV3-Large (He et al., 2021) as our text encoder, which also incorporates relative position information via disentangled attention, similar to our rotary module. We set the maximum token length to 512, and the maximum length of the ESI to 256. We split a query into sub-queries when the length of the ESI is beyond the limit. Detailed hyper-parameters are available in Appendix B. Due to space limitations, we have included the implementation details of some experiments in Appendix C.
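The sub-query splitting can be sketched as a greedy chunking of the type list under a token budget; the whitespace-based length function here is a simplification of the real tokenizer.

```python
def split_esi(types, budget):
    """Greedily chunk a type list so each sub-query's ESI fits the budget."""
    chunks, current, used = [], [], 0
    for t in types:
        cost = len(t.split()) + 1          # +1 for the [T] marker token
        if current and used + cost > budget:
            chunks.append(current)          # start a new sub-query
            current, used = [], 0
        current.append(t)
        used += cost
    if current:
        chunks.append(current)
    return chunks

types = ["location", "organization", "person", "work for (organization)"]
chunks = split_esi(types, budget=5)
```

Each chunk is then paired with the same prefix and text to form one sub-query, and the outputs are merged afterwards.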
The detailed datasets and evaluation metrics are listed in Appendix A.

Main Results
We first conduct experiments with full-shot training data.Table 1 presents a comprehensive comparison of RexUIE against T5-UIE (Lu et al., 2022), USM (Lou et al., 2023), and previous task-specific models, both in pre-training and non-pre-training scenarios.
We can observe that: 1) RexUIE surpasses the task-specific state-of-the-art models on more than half of the IE tasks even without pre-training. RexUIE exhibits a higher F1 score than both USM and T5-UIE across all the ABSA datasets. Furthermore, RexUIE's performance on Event Extraction is remarkably superior to that of the baseline models. 2) Pre-training brings slight performance improvements. By comparing the outcomes in the last three columns, we can observe that pre-trained RexUIE is ahead of T5-UIE and USM on the majority of datasets. After pre-training, ACE05-Evt shows a significant improvement of approximately 2% in F1 score. This implies that RexUIE effectively utilizes the semantic information in prompt texts and establishes links between text spans and their corresponding types. It is worth noting that the schema of trigger words and arguments in ACE05-Evt is complex, and the model heavily relies on the semantic information of labels.
3) The bottom two rows describe the results of extracting quadruples and quintuples, compared with the SoTA methods. Our model demonstrates significantly superior performance on both HyperRED and Camera-COQE, which shows its effectiveness in extracting complex schemas.

Few-Shot Information Extraction
We conducted few-shot experiments on one dataset for each task, following Lu et al. (2022) and Lou et al. (2023). The results are shown in Table 2.
In general, RexUIE exhibits superior performance compared to T5-UIE and USM in low-resource settings. Specifically, RexUIE relatively outperforms T5-UIE by 56.62% and USM by 32.93% on average in 1-shot scenarios. The success of RexUIE in low-resource settings can be attributed to its ability to exploit information learned during pre-training, and to the efficacy of our proposed query, which facilitates explicit schema learning.

Zero-Shot Information Extraction
We conducted zero-shot experiments on RE and NER, comparing RexUIE with other pre-trained models, including ChatGPT. We adopt the pipeline proposed by Wei et al. (2023) for ChatGPT. We used CoNLL04 and CoNLLpp (Wang et al., 2019) for RE and NER, respectively. We report Precision, Recall and F1 in Table 3. RexUIE achieves the highest zero-shot extraction performance on the two datasets. Furthermore, we analyzed bad cases of ChatGPT: 1) ChatGPT generated words that did not exist in the original text; for example, it output the span "Coats Michael" while the original text was "Michael Coats". 2) Errors caused by inappropriate granularity, such as "city in Italy" versus "Italy". 3) Illegal extraction against the schema: ChatGPT output (Leningrad, located in, Kirov Ballet), while "Kirov Ballet" is an organization rather than a location.

Complex Schema Extraction
To illustrate the significance of the ability to extract complex schemas, we designed a forced approach for T5-UIE that extracts three triples to form one quintuple. Details are available in Appendix C.
Table 4 shows the results comparing RexUIE with T5-UIE. In general, RexUIE's approach of directly extracting quintuples exhibits superior performance. Although T5-UIE shows a slight performance improvement after pre-training, it remains approximately 8% lower than RexUIE in F1. We analyze the distribution of relation type versus subject type-object type predicted by T5-UIE, as illustrated in Figure 5.

Absence of Explicit Schema
We observe that illegal extractions, such as (person, work for, location), are not rare in the 1-shot setting, and a considerable number of subjects or objects are not properly extracted during the NER stage. Although this issue is alleviated in the 5-shot scenario, we believe that an implicit schema instructor still negatively affects the model's performance.

Conclusion
In this paper, we introduce RexUIE, a UIE model using multiple prompts to recursively link types and spans based on an extraction schema. We redefine UIE with the ability to extract schemas with any number of spans and types. Through extensive experiments under both full-shot and few-shot settings, we demonstrate that RexUIE outperforms state-of-the-art methods on a wide range of datasets, including quadruple and quintuple extraction. Our empirical evaluation highlights the significance of explicit schemas and emphasizes that the ability to extract complex schemas cannot be substituted.

Limitations
Despite demonstrating impressive zero-shot entity recognition and relation extraction performance, RexUIE currently lacks zero-shot capabilities in event and sentiment extraction due to the limitation of pre-training data. Furthermore, RexUIE is not yet capable of covering all NLU tasks, such as Text Entailment.

A Detailed Supervised Datasets for Downstream Tasks

The detailed datasets and evaluation metrics are listed in Table 5. We explain the evaluation metrics as follows.
Entity Strict F1 An entity mention is correct if its offsets and type match a reference entity.
Relation Strict F1 A relation is correct if its relation type is correct and the offsets and entity types of the related entity mentions are correct.
Relation Triplet F1 A relation is correct if its relation type is correct and the string of the related entity mentions are correct.
Event Trigger F1 An event trigger is correct if its offsets and event type match a reference trigger.
Event Argument F1 An event argument is correct if its offsets, role type, and event type match a reference argument mention.
Sentiment Strict F1 For triples, a sentiment is correct if the offsets of its target and opinion and the sentiment polarity match the ground truth. For quintuples, a sentiment is correct if the offsets of its subject, object, aspect, and opinion and the sentiment polarity match the ground truth.
Quadruple Strict F1 A relation quadruple is correct if the relation type and the types and offsets of its subject, object, and qualifier match the ground truth.
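All of the strict metrics above reduce to exact matching of tuples against the gold set, which can be sketched as below; the field encoding of each item is illustrative.

```python
def strict_f1(pred, gold):
    """Strict F1: a predicted item counts only on an exact match of all fields."""
    pred, gold = set(pred), set(gold)
    tp = len(pred & gold)
    p = tp / len(pred) if pred else 0.0
    r = tp / len(gold) if gold else 0.0
    return 2 * p * r / (p + r) if p + r else 0.0

# Entities encoded as ((start, end), type); one prediction has the wrong type.
gold = {((0, 2), "person"), ((5, 7), "organization")}
pred = {((0, 2), "person"), ((5, 7), "location")}
score = strict_f1(pred, gold)
```

The relation, event, and quadruple variants differ only in how many fields go into each tuple, not in the matching logic.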

B Implementation Details
We download the supervised data for pre-training from HuggingFace. For all the downstream datasets, we follow the procedure of Lu et al. (2022) and Lou et al. (2023), and then convert them to the input format of RexUIE. We implement the pre-training model and trainer based on Transformers (Wolf et al., 2020). We adopt DeBERTaV3-Large (He et al., 2021) as our text encoder. We set the maximum token length to 512, and the maximum length of the prompt to 256. We split a query into sub-queries containing prompt text segments when the length of the prompt text is beyond the limit. Our model is optimized by AdamW (Loshchilov and Hutter, 2017), with weight decay 0.01 and threshold δ = 0. We set the clip gradient norm to 2 and the warmup ratio to 0.1. The hyper-parameters for grid search are listed in Table 6.

C Details of Experiments Settings
Few-Shot IE We conducted few-shot experiments on one dataset for each task, following Lu et al. (2022) and Lou et al. (2023). Specifically, we sample 1/5/10 sentences for each type of entity/relation/event/sentiment from the training set.
To avoid the influence of sampling noise, we repeated each experiment 10 times with different samples.
Zero-Shot IE We used CoNLL04 for RE due to its unique subject and object entity types for each relation.For NER, we employed CoNLLpp (Wang et al., 2019), which is a corrected version of the CoNLL2003 NER dataset.In order to prevent the performance of ChatGPT from being affected by randomly selected instructions, we adopted the SoTA zero-shot information extraction framework with ChatGPT proposed by Wei et al. (2023).
Complex Schema Extraction T5-UIE was initially limited to extracting only triples.We designed a forced approach to extract quintuples for T5-UIE, which extracts three tuples to form one quintuple.
The quintuple in COQE can be represented as (subject, object, aspect, opinion, sentiment). We propose to model quintuple extraction as extracting three triples: (subject, "subject-object", object), (object, "object-aspect", aspect), and (aspect, sentiment, opinion).

D Ablation Study

We conducted an ablation experiment on RexUIE to explore the influence of Prompts Isolation and rotary embedding, where RexUIE is not pre-trained. The results are listed in Table 7. The experimental results demonstrate that removing Prompts Isolation leads to a decrease in RE performance, while the exclusion of rotary embedding is detrimental to both relation and sentiment extraction. Overall, the complete RexUIE exhibits superior performance: removing Prompts Isolation or rotary embedding results in a slight decline, with the most significant drop observed when both are removed.

F Insights to the Schema Complexity and Training Data Size
Under the full-shot setting, the improvement of RexUIE over previous UIE models is not significant (only 1% across 4 tasks and 14 metrics). At the same time, we have also found that for different tasks and datasets, the improvement of RexUIE seems to exhibit some randomness, which may be related to several factors such as schema complexity, training data size, task type, and the extent of task exploration. Among these factors, we believe that schema complexity and training data size are the most important, so we conducted a statistical analysis to better summarize the patterns.
(We only consider the full-shot case without pre-training, to avoid the influence of pre-training.) • Schema complexity: Due to ESI and the recursive strategy, we intuitively believe that RexUIE has certain advantages in handling complex schemas. We use the number of leaf nodes in the schema to represent its complexity, denoted C.
• Training data size: We know that as the training data size increases, the differences between the performance of models will narrow. Therefore, we believe that the performance improvement is negatively correlated with training data size. We denote the training data size as S.
To investigate the pattern, we introduce an intermediate variable log(10000 × C/S). After removing certain outliers and event-trigger datasets, we find a positive correlation between log(10000 × C/S) and the relative improvement in Table 9, which supports our hypothesis.

G Pre-training Data
D distant We remove abstract and over-specialized entity types and relation types (such as "structural class of chemical compound") and remove categories that occur fewer than 10,000 times. We also remove examples that do not contain any relations. Finally, we collect 3M samples containing entities and relations as D distant.
D mrc Specifically, we collect SQuAD (Rajpurkar et al., 2016) and HellaSwag (Zellers et al., 2019) together as D mrc. The MRC data is constructed as pairs of questions and answers. For implementation, we use the question as the type and consider the answer as the span to extract.
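The question-as-type conversion can be sketched as follows; the output field names are assumptions for illustration, not the actual pre-training format.

```python
def mrc_to_instance(context, question, answer_text, answer_start):
    """Convert one SQuAD-style QA pair into a span-extraction instance:
    the question becomes the type, the answer span becomes the target."""
    end = answer_start + len(answer_text)
    assert context[answer_start:end] == answer_text  # sanity-check the offsets
    return {"text": context,
            "schema": {question: None},
            "targets": [((answer_start, end), question)]}

ctx = "Enron was founded in Houston."
inst = mrc_to_instance(ctx, "Where was Enron founded?", "Houston", 21)
```

Because the "type" is now free-form natural language, this data teaches the model to ground arbitrary prompt text rather than a fixed label vocabulary.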

H Example of Schema
Schema examples for some datasets are listed in Table 10.

I Query Example
Some query examples are listed in Table 11 and  Table 12.

Figure 1 :
Figure 1: Comparison of RexUIE with previous UIE. (a) Previous UIE models the information extraction task by defining text spans and the relation between span pairs, but is limited to extracting only two spans. (b) Our proposed RexUIE recursively extracts text spans for each type based on a given schema, and feeds the extracted information into the following extraction.

Figure 3 :
Figure 3: Queries and score matrices for NER and RE. The left sub-figure shows how to extract the entities "Steve Jobs" and "Apple". The right sub-figure shows how to extract the relation given the entity "Steve Jobs" coupled with the type "person". The schema is organized as {"person": {"work for (organization)": null}, "organization": null}. The score matrix is separated into three valid parts: token head-tail, type-token tail and token head-type. The cells scored as 1 are darkened; the others are scored as 0.

Figure 4 :
Figure 4: Token type ids, position ids and attention mask for RexUIE. p and t denote the prefix and types of the first group of previously extracted results. q and u denote the prefix and types for the second group.
[CLS] [P] person: Kennedy [T] kill (person) [T] live in (location) ... [P] person: Lee Harvey Oswald [T] kill (person) [T] live in (location) ... However, the hidden representations of the type kill (person) should not be interfered with by the type live in (location). Similarly, the hidden representations of the prefix person: Kennedy should not be interfered with by other prefixes (such as person: Lee Harvey Oswald) either.

Figure 5 :
Figure 5: The distribution of relation type versus subject type-object type predicted by T5-UIE.We circle the correct cases in orange.
[P][T] location[T] miscellaneous[T] organization[T] person[Text] EU rejects German call to boycott British lamb .[SEP]
CoNLL04 0
[CLS][P][T] location[T] organization[T] other[T] people[Text] The self-propelled rig Avco 5 was headed to shore with 14 people aboard early Monday when it capsized about 20 miles off the Louisiana coast , near Morgan City , Lifa said.[SEP]
[CLS][P] location: Morgan City[T] located in ( location )[P] location: Louisiana[T] located in ( location )[P] people: Lifa[T] kill ( people )[T] live in ( location )[T] work for ( organization )[Text] The self-propelled rig Avco 5 was headed to shore with 14 people aboard early Monday when it capsized about 20 miles off the Louisiana coast , near Morgan City , Lifa said.[SEP]
[P][T] acquit[T] appeal[T] arrest jail[T] attack[T] born[T] charge indict[T] convict[T] declare bankruptcy[T] demonstrate[T] die[T] divorce[T] elect[T] end organization[T] end position[T] execute[T] extradite[T] fine[T] injure[T] marry[T] meet[T] merge organization[T] nominate[T] pardon[T] phone write[T] release parole[T] sentence[T] start organization[T] start position[T] sue[T] transfer money[T] transfer ownership[T] transport[T] trial hearing[Text] The electricity that Enron produced was so exorbitant that the government decided it was cheaper not to buy electricity and pay Enron the mandatory fixed charges specified in the contract .[SEP]
[CLS][P] transfer money: pay[T] beneficiary[T] giver[T] place[T] recipient[Text] The electricity that Enron produced was so exorbitant that the government decided it was cheaper not to buy electricity and pay Enron the mandatory fixed charges specified in the contract .[SEP]
1
[CLS][P][T] acquit[T] appeal[T] arrest jail[T] attack[T] born[T] charge indict[T] convict[T] declare bankruptcy[T] demonstrate[T] die[T] divorce[T] elect[T] end organization[T] end position[T] execute[T] extradite[T] fine[T] injure[T] marry[T] meet[T] merge organization[T] nominate[T] pardon[T] phone write[T] release parole[T] sentence[T] start organization[T] start position[T] sue[T] transfer money[T] transfer ownership[T] transport[T] trial hearing[Text] and he has made the point repeatedly in interview after interview that he has never claimed to speak for god , nor has he claimed that this is " god 's war "[SEP]
[CLS][P] attack: war[T] attacker[T] instrument[T] place[T] target[T] victim[Text] and he has made the point repeatedly in interview after interview that he has never claimed to speak for god , nor has he claimed that this is " god 's war "[SEP]

Table 1 :
F1 results for UIE models with pre-training. *-Trg means evaluating models with Event Trigger F1 and *-Arg means evaluating models with Event Argument F1; detailed metrics are listed in Appendix A. T5-UIE and USM are the previous SoTA UIE models proposed by Lu et al. (2022) and Lou et al. (2023), respectively.

Table 3 :
Zero-Shot performance on RE and NER.* indicates that the experiment is conducted by ourselves.

Table 5 :
Detailed supervised datasets and evaluation metrics for each task.