IELM: An Open Information Extraction Benchmark for Pre-Trained Language Models

We introduce a new open information extraction (OIE) benchmark for pre-trained language models (LM). Recent studies have demonstrated that pre-trained LMs, such as BERT and GPT, may store linguistic and relational knowledge. In particular, LMs are able to answer “fill-in-the-blank” questions when given a pre-defined relation category. Instead of focusing on pre-defined relations, we create an OIE benchmark aiming to fully examine the open relational information present in pre-trained LMs. We accomplish this by turning pre-trained LMs into zero-shot OIE systems. Surprisingly, pre-trained LMs are able to obtain competitive performance on both standard OIE datasets (CaRB and Re-OIE2016) and two new large-scale factual OIE datasets (TAC KBP-OIE and Wikidata-OIE) that we establish via distant supervision. For instance, the zero-shot pre-trained LMs outperform state-of-the-art supervised OIE methods in F1 on our factual OIE datasets without using any training data.


Introduction
Pre-trained language models (LM), such as BERT (Devlin et al., 2018) and GPT-3 (Brown et al., 2020), have revolutionized NLP over the last several years and advanced the state-of-the-art results in a wide set of downstream NLP tasks. Recent studies show that a considerable amount of linguistic (Hewitt and Manning, 2019; Clark et al., 2019) and relational knowledge (Petroni et al., 2019; Talmor et al., 2019; Jiang et al., 2020; Petroni et al., 2020) has been captured by the pre-trained LMs via pre-training on large-scale textual corpora. These approaches often design "fill-in-the-blank" questions based on pre-defined relations. For example, the question "Bob Dylan was born in _" is manually created for the LMs to answer the "birthplace" relation of "Bob Dylan".
Most existing approaches that evaluate what pre-trained LMs have learned are based on benchmarks with pre-defined relation categories. Yet, these benchmarks present two limitations. First, most benchmarks only cover a limited number of pre-defined relations. It is therefore unclear whether the pre-trained LMs have stored general open relation information. For example, the Google-RE portion of the LAMA benchmark (Petroni et al., 2019) includes only three relations (i.e., "birthplace", "birthdate", and "deathplace"), while hundreds of relations occur in real-world scenarios. Second, a majority of benchmarks evaluate LMs in a closed manner, meaning the gold relation is given to the model as input, e.g., "was born in". Moreover, the existing benchmarks often provide a single gold relation for each input sentence, even though an input sentence may indicate multiple relations, e.g., containing both "birthplace" and "birthdate" information about an argument or entity. This raises the question: beyond limited relational information, can we systematically benchmark the general information stored in pre-trained LMs?
In this work, we set up a new open information extraction (OIE) benchmark, called IELM, towards testing the general and open relational information stored in pre-trained LMs. We refer to OIE as it is a task designed to extract open relations from massive corpora without requiring a pre-defined relation category. As shown in Figure 1, we convert pre-trained LMs into zero-shot OIE systems. We apply them to two standard OIE datasets, including CaRB (Bhardwaj et al., 2019) and Re-OIE2016 (Stanovsky and Dagan, 2016; Zhan and Zhao, 2020), as well as two new large-scale factual OIE datasets in our IELM benchmark. We show that the zero-shot pre-trained LMs outperform the fully supervised state of the art on factual OIE datasets. Standard OIE datasets rely on human annotations and often consist of thousands of gold triples and sentences. Unlike those datasets, we create two large-scale OIE datasets, namely TAC KBP-OIE and Wikidata-OIE, via distant supervision from knowledge graphs. For example, Wikidata-OIE is constructed by aligning English Wikipedia to Wikidata triples, resulting in millions of triples and documents. The design of zero-shot LMs for OIE is important: by encoding the noun chunks as arguments in the input, we only make use of the parameters of pre-trained LMs to decode the predicates (or relations) between the arguments.
To the best of our knowledge, this is the first attempt to systematically evaluate pre-trained LMs in a zero-shot OIE setting. To summarize, our key contributions are the following.
1. We benchmark the general relational information in pre-trained LMs on our IELM benchmark. Besides two standard OIE datasets (CaRB and Re-OIE2016), we also create two large-scale factual OIE datasets for our benchmark. The two new OIE datasets are called TAC KBP-OIE and Wikidata-OIE, and are constructed via distant supervision from two knowledge graphs (TAC KBP and Wikidata). Our benchmark is a general OIE benchmark, helping the development of future OIE systems.
2. We enable the zero-shot capabilities of pre-trained LMs for OIE by encoding the arguments in the input and decoding the predicates using the parameters of pre-trained LMs. The pre-trained LMs are particularly good at recovering factual arguments and predicates.
3. We test the OIE performance of 6 pre-trained LMs (the BERT and GPT-2 (Radford et al., 2019) families) and 14 OIE systems on the IELM benchmark. The zero-shot LMs achieve state-of-the-art OIE performance on TAC KBP-OIE and Wikidata-OIE, even outperforming fully supervised OIE systems.

Language Models as Zero-Shot Information Extractors
For open information extraction (OIE), we take an NP-chunked sentence as input and output a set of triples. Below is an example.

Input: Dylan_NP was born in Minnesota_NP, and was awarded Nobel Prize_NP.
Output: (Dylan; born in; Minnesota), (Dylan; awarded; Nobel Prize).

NP denotes a noun phrase.

Argument Extraction
Following traditional linguistic OIE systems such as Stanford OpenIE (Angeli et al., 2015) and OpenIE5 (Saha et al., 2017, 2018), we treat each NP pair as an argument pair (e.g., "Dylan" and "Minnesota"). We then utilize the parameters of LMs to extract the predicates (e.g., "born in") between the pair in the input, as described below.
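The argument extraction step is small enough to sketch directly. Below is a minimal illustration, assuming spaCy's noun chunker (the chunker used in Appendix A.2.1); treating every ordered pair of noun chunks as a candidate argument pair follows the description above.

```python
# Minimal sketch of argument extraction: every ordered pair of noun
# chunks in a sentence becomes a candidate argument pair.
from itertools import combinations

import spacy

nlp = spacy.load("en_core_web_sm")

def argument_pairs(sentence: str):
    """Return all candidate argument pairs (noun chunk pairs) in textual order."""
    doc = nlp(sentence)
    chunks = list(doc.noun_chunks)  # e.g., "Dylan", "Minnesota", "Nobel Prize"
    # The LM later scores the predicates between each pair.
    return [(a, b) for a, b in combinations(chunks, 2)]

pairs = argument_pairs("Dylan was born in Minnesota, and was awarded Nobel Prize.")
for a, b in pairs:
    print(a.text, "|", b.text)
```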

Predicate Extraction
The predicate extraction problem is formulated as extracting a set of sequences in the input that are associated with an argument pair. We use the attention scores in a pre-trained LM to measure the relevance between a sequence and the argument pair, and we frame the process as a search problem: given an argument pair, we aim to find the sequences with the largest attention scores connecting the pair. Computing a score for every possible sequence is computationally expensive, especially when the sequence length is large, so exhaustive search is intractable. We instead adapt beam search as an approximate strategy to efficiently explore the search space. Beam search maintains the k-best candidates, so its time cost does not depend on the sequence length, but on the size of the beam and the average length of the candidates. In general, the beam search starts with the first argument (e.g., "Dylan"). At each step, it selects the top-k next tokens with the largest attention scores and keeps the k partial candidates with the highest scores, where k is the beam size. When a candidate produces the second argument (e.g., "Minnesota"), the candidate is complete. We show a running example as follows. Consider first the search from left to right with beam size equal to 1. An example search process is shown in Figure 2. Given an argument pair "Dylan" and "Minnesota", at each step, the search performs one of the following actions:
• START the search from the first argument. The first argument is added as an initial candidate into the beam. In Figure 2(a), at step 0, "Dylan" is added into the beam. The total attention score is initialized to 0.
• YIELD a new candidate by appending the next token with the largest attention score to the current candidate and adding that score to the total. In Figure 2(a), steps 1 and 2 take this action; after step 2, the score becomes 0.5.
• STOP the search step if the candidate has reached the second argument, then add the candidate as a valid triple into the beam. As the beam size equals 1, (Dylan; born in; Minnesota) is returned for the given pair. The final score of the triple is 0.7.
We also notice triples often appear in reverse order in the sentence; we therefore enable bidirectionality by running the algorithm in both directions (left to right and right to left). We merge subwords into words and only consider word-level attention. The beam search is implemented via breadth-first search, which is efficient, as the time complexity is O(k · d), where d is the maximum depth of the search tree.
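To make the search concrete, below is a simplified sketch of the predicate beam search. It assumes a word-level attention matrix A (e.g., the mean over the last layer's heads, per the parameter study) and single-word arguments; the function name is ours, and the full Algorithm 1 additionally handles multi-word arguments, the action manager, and the right-to-left pass.

```python
import heapq

import numpy as np

def beam_search_predicates(A: np.ndarray, start: int, end: int,
                           k: int = 6, max_len: int = 10):
    """Search for predicate paths from word `start` to word `end`.

    A[i, j] is the word-level attention score between words i and j.
    Returns complete candidates as (length-normalized score, word indices);
    the predicate is the path minus its two argument endpoints.
    """
    beam = [(0.0, [start])]   # START: the first argument is the initial candidate
    completed = []
    for _ in range(max_len):
        expansions = []
        for score, path in beam:
            i = path[-1]
            for j in range(i + 1, A.shape[0]):   # move left to right
                expansions.append((score + A[i, j], path + [j]))
        # Keep only the k highest-scoring partial candidates.
        beam = []
        for score, path in heapq.nlargest(k, expansions, key=lambda x: x[0]):
            if path[-1] == end:                  # reached the second argument: STOP
                completed.append((score / len(path), path))
            else:                                # otherwise: YIELD
                beam.append((score, path))
        if not beam:
            break
    return heapq.nlargest(k, completed, key=lambda x: x[0])
```

A right-to-left pass can be approximated by running the same routine on the transposed attention matrix with the arguments swapped; the length normalization of the score mirrors the normalization described in the parameter study (Appendix A.2.3).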

Standard OIE
We adopt the two standard OIE datasets below.
CaRB CaRB (Bhardwaj et al., 2019) re-annotates the input sentences of OIE2016 via crowdsourcing for a higher-quality evaluation.
Re-OIE2016 Re-OIE2016 (Zhan and Zhao, 2020) is also generated based on the input sentences in OIE2016, and is further enhanced by human annotations.

Factual OIE
In addition, we introduce two large-scale factual OIE datasets based on knowledge graphs (KG).
TAC KBP-OIE TAC Knowledge Base Population (KBP) Slot Filling is a task to search a document collection to fill in a target entity for pre-defined relations (slots) with a given entity in a reference KG. We adapt the dataset as an OIE dataset. In particular, we use a document sub-collection of the TAC KBP 2013 task (Surdeanu, 2013) as the input, and use the official human annotations regarding the documents as gold extractions.
Wikidata-OIE Besides TAC KBP-OIE, we create a larger factual OIE dataset based on English Wikipedia. Different from TAC KBP, there are no gold triple annotations for Wikipedia. Since a large number of Wikidata triples originate from English Wikipedia, we create the dataset using distant supervision (Zhang et al., 2017) by aligning Wikidata triples to Wikipedia text. For scalability, we employ an unsupervised entity linker based on a pre-built mention-to-entity dictionary (Spitkovsky and Chang, 2012) to extract potential gold arguments. The entity linker links an arbitrary entity mention in a sentence to a Wikipedia anchor, which is further linked to a Wikidata entity. For each sentence in Wikipedia articles containing two linked arguments, if there is a Wikidata triple describing a relation holding between the two arguments, we take the Wikidata triple as a gold extraction.
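A minimal sketch of this alignment is shown below, with hypothetical helper names; it assumes the entity mentions in each sentence have already been linked to Wikidata IDs by the dictionary-based linker.

```python
# Hedged sketch of distant supervision: a Wikidata triple becomes a gold
# extraction when both of its arguments are linked in the same sentence.
from collections import defaultdict
from itertools import permutations

def build_index(kg_triples):
    """Index Wikidata triples by (head, tail) for fast lookup."""
    index = defaultdict(list)
    for head, rel, tail in kg_triples:
        index[(head, tail)].append(rel)
    return index

def gold_triples_for_sentence(linked_entities, index):
    """linked_entities: Wikidata IDs linked in one sentence by the
    mention-to-entity dictionary. Any KG triple over a pair of them
    is taken as a gold extraction for that sentence."""
    gold = []
    for h, t in permutations(linked_entities, 2):
        for rel in index.get((h, t), []):
            gold.append((h, rel, t))
    return gold
```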
Unlike TAC KBP-OIE, which is built from human annotations, Wikidata-OIE is derived from automatic annotations. We therefore evaluate our unsupervised entity linker on the standard AIDA benchmark (Hoffart et al., 2011) consisting of Wikipedia entities. Table 1 shows that it significantly improves over the original unsupervised linker (Spitkovsky and Chang, 2012) and is competitive with a supervised method (Kolitsas et al., 2018). Given the scale of Wikidata-OIE, we trade an acceptable amount of effectiveness for efficiency.
The statistics of the datasets are shown in Table 2. For CaRB and Re-OIE2016, we report the statistics of the corresponding test sets.

Unidirectional Language Models Unidirectional LMs predict the next word conditioned on the previous words, formally, $p(x_t) = p(x_t \mid x_1, \ldots, x_{t-1})$. We consider GPT-2 (Radford et al., 2019), whose hidden states $h_t$ are produced by Transformer decoders (Vaswani et al., 2017). GPT-2 is pre-trained on WebText, containing 40GB of text. We explore all four pre-trained GPT-2s with different model sizes: GPT-2 (117M), GPT-2 MEDIUM (345M), GPT-2 LARGE (774M), and GPT-2 XL (1558M).
Bidirectional Language Models Different from unidirectional LMs that predict the next word given the previous words, bidirectional LMs take both the left and right context of the target word into consideration, formally, $p(x_t) = p(x_t \mid x_1, \ldots, x_{t-1}, x_{t+1}, \ldots, x_N)$.
We use BERT (Devlin et al., 2018), which enables bidirectional context modeling via a masked LM objective and the Transformer architecture. BERT is pre-trained on BooksCorpus and English Wikipedia. We use both of its pre-trained settings: BERT BASE (109M) and BERT LARGE (335M).

Comparison Methods
We compare our method with a wide set of OIE systems, including both neural and traditional linguistic OIE systems. Most OIE systems are based on supervised learning, which are indicated with asterisks (*) in Table 3. We provide details of the comparison systems in Appendix A.5.

Standard OIE
On CaRB and Re-OIE2016, we follow the original evaluation protocols proposed in (Bhardwaj et al., 2019) and (Stanovsky and Dagan, 2016; Zhan and Zhao, 2020), and report precision, recall, F1, and area under the curve (AUC) for the compared OIE systems. AUC is calculated from a plot of the precision and recall values over all potential confidence thresholds. F1 is the maximum value among the precision-recall pairs. We follow the matching function proposed for each dataset, i.e., lexical match for Re-OIE2016 and tuple match for CaRB. The CaRB evaluation function is stricter, as it penalizes long extractions.
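For concreteness, here is a small sketch of how the two summary metrics can be derived from per-threshold precision-recall pairs (the matching functions themselves are dataset-specific and not shown).

```python
import numpy as np

def summarize_pr_curve(precisions, recalls):
    """F1 is the maximum F1 over all (precision, recall) pairs; AUC is
    the area under the precision-recall curve (trapezoidal rule)."""
    p = np.asarray(precisions, dtype=float)
    r = np.asarray(recalls, dtype=float)
    f1 = np.max(2 * p * r / np.clip(p + r, 1e-12, None))
    order = np.argsort(r)            # integrate along the recall axis
    auc = np.trapz(p[order], r[order])
    return f1, auc
```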

Factual OIE
We report precision, recall, and F1 of the OIE systems on the two large-scale factual OIE datasets: TAC KBP-OIE and Wikidata-OIE. We introduce exact match as the matching function for them, as described below.
Matching Function The matching functions for standard OIE datasets are generally flexible. For example, the lexical match of Re-OIE2016 judges an argument or predicate as correct if and only if it includes the syntactic head of the gold argument or predicate. Unlike these matching functions, our exact matching function requires that both arguments and predicates be linked to the gold extractions.
For TAC KBP-OIE, we judge an argument to be correct if and only if it matches both the name and the span position of the gold argument in the sentence. The main challenge is how to properly link a predicate, since there are often many ways to express it. We follow Stanford OpenIE (Angeli et al., 2015) to produce the predicate mapping between the OIE relations and the TAC KBP predicates. A predicate is correct if the pair of OIE relation and gold predicate exists in the predicate mapping. The predicate mapping is constructed in two steps. First, a collection of predicate mappings was constructed by a single annotator in approximately a day. Second, the predicate mappings were finalized through a learning procedure that matches OIE relations to TAC KBP predicates by searching for co-occurring relations in a large distantly labeled corpus, keeping pairs of OIE relations and TAC KBP predicates that have a high PMI. The basic idea is that the more often the argument pairs of the OIE triples and the TAC KBP triples are linked, the more likely the corresponding relations or predicates are linked to each other. Example predicate mappings are shown in Appendix A.4.
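A sketch of the PMI step under our reading: each time an OIE triple and a TAC KBP triple share an argument pair, the pair of their relations is recorded, and high-PMI pairs are kept. The function and threshold names are illustrative.

```python
import math
from collections import Counter

def pmi_mapping(aligned_pairs, threshold):
    """aligned_pairs: (oie_relation, kbp_predicate) tuples, one per case
    where an OIE triple and a TAC KBP triple share an argument pair.
    Returns the relation pairs whose PMI exceeds `threshold`."""
    pair_counts = Counter(aligned_pairs)
    rel_counts = Counter(r for r, _ in aligned_pairs)
    pred_counts = Counter(p for _, p in aligned_pairs)
    n = len(aligned_pairs)
    mapping = set()
    for (rel, pred), c in pair_counts.items():
        # PMI = log( P(rel, pred) / (P(rel) * P(pred)) )
        pmi = math.log((c / n) / ((rel_counts[rel] / n) * (pred_counts[pred] / n)))
        if pmi > threshold:
            mapping.add((rel, pred))
    return mapping
```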
For Wikidata-OIE, we link an argument using the entity linker from the Wikidata-OIE construction (Sec. 3.1). An argument is correct if the linked argument matches both the name and the span position of the gold argument in the sentence. The predicate mapping is bootstrapped from TAC KBP-OIE's mapping. In addition, we normalize each predicate phrase of the triples by lemmatizing and removing inflections, auxiliary verbs, adjectives, and adverbs. One author manually filters out the bad predicate pairs; this process takes approximately a day. A predicate is correct if the pair of OIE and gold predicates exists in the bootstrapped predicate mapping. An annotator randomly subsampled and checked 100 aligned triple-sentence pairs, concluding that the extracted triples are 93% accurate.
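The normalization step might look as follows, assuming spaCy for lemmatization and part-of-speech tags; the exact pipeline is not specified in more detail in the paper.

```python
import spacy

nlp = spacy.load("en_core_web_sm")

DROP_POS = {"AUX", "ADJ", "ADV"}  # auxiliary verbs, adjectives, adverbs

def normalize_predicate(phrase: str) -> str:
    """Lemmatize a predicate phrase and drop auxiliaries, adjectives, and
    adverbs, approximating the normalization described above."""
    doc = nlp(phrase)
    return " ".join(tok.lemma_ for tok in doc if tok.pos_ not in DROP_POS)

print(normalize_predicate("was born in"))  # "was" (AUX) dropped -> "bear in"
```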
Metrics We use the official scorer of TAC KBP Slot Filling 2013 to calculate precision, recall, and F1 for TAC KBP-OIE. Besides, like previous OIE systems, for LMs we adopt two constraints from ReVerb (Fader et al., 2011), a predecessor of OpenIE5: (i) the frequency of a predicate must be above a threshold, to avoid over-specified triples, and (ii) a predicate must be a contiguous sequence in the sentence, to avoid predicates that have no meaningful interpretation. We set these parameters according to Sec. 4.4.
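The two ReVerb-style constraints can be applied as a post-filter, sketched below with illustrative names; `min_freq=10` matches the setting in Appendix A.2.3, and contiguity is approximated by a substring check.

```python
from collections import Counter

def apply_reverb_constraints(triples, sentences, min_freq=10):
    """Filter triples by the two ReVerb-style constraints: (i) the predicate
    must occur at least `min_freq` times across all extractions, and (ii) the
    predicate must appear as a contiguous sequence in its source sentence."""
    pred_freq = Counter(pred for _, pred, _ in triples)
    kept = []
    for (arg0, pred, arg1), sent in zip(triples, sentences):
        if pred_freq[pred] >= min_freq and pred in sent:
            kept.append((arg0, pred, arg1))
    return kept
```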
We only report precision, recall, and F1 based on the parameter study in Sec. 4.4 for pre-trained LMs on the IELM benchmark. We do not compute AUC, as pre-trained LMs are treated as zero-shot OIE systems and we therefore do not tune the results with respect to different confidence thresholds. Our main focus is to benchmark the OIE performance of the LMs under a unified setting. Another reason is that it is computationally expensive to compute AUC on the two large-scale datasets, TAC KBP-OIE and Wikidata-OIE. We also do not report AUC for the compared OIE systems on TAC KBP-OIE and Wikidata-OIE; instead, we use the confidence threshold that obtains the best F1 on Re-OIE2016 to compute the scores.

Results
In this section, we show that pre-trained LMs are effective zero-shot OIE systems, and exceed the previous state-of-the-art OIE systems on the large-scale factual OIE datasets in our IELM benchmark. To keep our evaluation as simple as possible, the hyperparameters and settings are shared across datasets. More experimental details are described in the appendix.

OIE Results
Table 3 shows the results. While zero-shot OIE systems synthesized by pre-trained LMs obtain notably lower scores compared to previous OIE systems on standard OIE datasets, they outperform the previous OIE systems on factual OIE datasets.
We also find that larger LMs obtain improved results on all datasets. For example, BERT LARGE outperforms BERT BASE thanks to its larger model size, and the GPT-2 models share a similar trend. This is because larger LMs store richer relational information, a finding consistent with previous studies (Petroni et al., 2019, 2020).

Standard OIE
The main reasons for the degraded performance of pre-trained LMs on standard OIE datasets are three-fold. First, the comparison methods mainly involve supervised systems that are trained on OIE datasets, denoted with asterisks (*) in Table 3.

Factual OIE
Compared to the results on standard OIE datasets, pre-trained LMs consistently achieve state-of-the-art performance on both factual datasets. Both datasets emphasize measuring factual arguments and predicates in the reference KGs. Previous studies (Petroni et al., 2019, 2020) show that LMs have stored a considerable amount of factual information via pre-training on large-scale text; we draw the same conclusion. To the best of our knowledge, our IELM benchmark is the first benchmark that includes factual OIE datasets. More importantly, both linguistic and neural OIE systems are derived from manually designed linguistic patterns or learned patterns. The results show that the pre-trained attention weights capture a more flexible set of factual patterns, and that our approach is capable of using such patterns. In order to scale our approach to large-scale datasets, the argument and predicate extraction are both efficient by design. In particular, the beam search for predicate extraction is efficient in exploring the relational sequences in the input sentence. Besides, the attention scores used in the beam search are produced via a single forward pass of the pre-trained LM over the input sentence, without fine-tuning.
Moreover, we find that BERT LMs outperform GPT-2 LMs under similar model sizes. On both datasets, BERT BASE performs better than GPT-2 in F1, and BERT LARGE outperforms GPT-2 MEDIUM in F1. This is mainly because the recall of BERT LMs is higher than that of the corresponding GPT-2 LMs. The result indicates that the Cloze-style loss function (i.e., masked LM) of BERT is more effective and flexible in recovering information than the autoregressive LM objective. We also notice that the precision of GPT-2 LMs is higher than that of BERT LMs. The reason is that the autoregressive LM objective captures more accurate information than the Cloze-style loss does by preventing extra noise (e.g., masked tokens) in pre-training.
Pre-trained LMs achieve competitive precision, e.g., above 60% on TAC KBP-OIE. However, only moderate recall is obtained, so improving recall is clearly a future direction. We find that both argument and predicate extraction can be further improved. For example, the main cause of the moderate recall is incorrect arguments produced by the spaCy noun chunker, as summarized in Sec. 4.2. Besides, we could incorporate predicates that are not between the argument pairs into the extractions, as we observe a number of gold triples in inverted sentences. We also notice that the F1 gain over the previous state of the art on TAC KBP-OIE is smaller than that on Wikidata-OIE; a larger text corpus, e.g., Wikipedia, provides more information. We could improve recall by running on larger corpora such as WebText2 and Common Crawl (Raffel et al., 2019; Brown et al., 2020) to collect more triples.

Error Analysis
There is still significant room to improve the results, and we argue that we are measuring a lower bound for what LMs know. To further understand the shortcomings of the current method, we conduct an error analysis of the errors in precision on all datasets. We choose BERT LARGE for the study. We sample 100 documents from the Wikidata-OIE dataset and manually check the reasons for the errors. We find 33.1% of the errors are caused by incorrect arguments, while the predicate phrases are correct; these errors are due to incorrect noun chunks detected by spaCy. 18.3% of the errors are due to missing pairs in the predicate mapping. We also note that approximately 23.8% of the errors are actually correct triples that are not covered by Wikidata. For example, (Bob_Dylan, residence, Nashville) does not exist in Wikidata, but it is a correct triple. The rest of the errors made by BERT LARGE are incorrect predicate phrases, such as uninformative phrases. We find similar errors made by BERT LARGE on the other datasets. Based on the above analysis, enhancing argument detection and predicate mapping would help further improve the results.

Runtime Analysis
The runtime of OIE systems is crucial in practice. We test the runtime of different OIE systems on Re-OIE2016. The results are in Table 4. We find ours is competitive in terms of efficiency given the size of the models.

Parameter Study
We study the effects of the key parameters using BERT BASE on TAC KBP-OIE, as shown in Figure 3. We randomly sample 20% of the oracle query entities (provided by TAC KBP) as a hold-out dataset to tune the parameters, and use the best parameter setting for all experiments. When studying the effect of a certain parameter, we keep the remaining parameters at their defaults. We use F1 to measure the effects. Additional details are described in Appendix A.2.3.

Related Work
Pre-trained language models (LM), e.g., BERT (Devlin et al., 2018), GPT (Radford et al., 2018, 2019), and large LMs with over 100B parameters (Brown et al., 2020; Chowdhery et al., 2022; Zeng et al., 2022), contain a growing amount of linguistic and factual knowledge obtained via pre-training on large-scale corpora. To evaluate their abilities, researchers have created many knowledge benchmarks. LAMA leverages manually created prompts (Petroni et al., 2019, 2020). Recent studies have also developed soft prompts (Liu et al., 2021; Zhong et al., 2021) for fact retrieval. KILT (Petroni et al., 2021) proposes a knowledge-intensive benchmark concerning several downstream tasks to evaluate LMs' ability to capture knowledge. Wang et al. (2022) utilize a set of knowledge-intensive structure prediction tasks to evaluate the knowledge in pre-trained LMs. Shen et al. (2022) adapt KG completion as a benchmark to evaluate LMs. Besides relational knowledge, closed-book OpenQA benchmarks (Roberts et al., 2020), in which LMs answer open-domain questions without retrieving contexts, have also been adopted as a way to evaluate LMs' knowledge. While the existing benchmarks evaluate LMs in an implicit way, the main difference is that our benchmark explicitly and interpretably evaluates triples extracted from textual corpora using model parameters (e.g., attentions). In the field of neural network interpretation (Linzen et al., 2016; Adi et al., 2016; Tenney et al., 2019), in particular pre-trained deep LM analysis, substantial recent work focuses on visualizing and analyzing the attention (Vig, 2019; Jain and Wallace, 2019; Clark et al., 2019; Michel et al., 2019; Ramsauer et al., 2020). Instead of analyzing or visualizing, our benchmark quantitatively evaluates the relational information with respect to open information extraction.
Open information extraction systems, e.g., OLLIE (Schmitz et al., 2012), ReVerb (Fader et al., 2011), Stanford OpenIE (Angeli et al., 2015), OpenIE5 (Saha et al., 2017, 2018), RnnOIE (Stanovsky et al., 2018), and OpenIE6 (Kolluru et al., 2020a), aim to extract triples from web corpora for open-schema KGs. Besides, NELL (Carlson et al., 2010), DeepDive (Niu et al., 2012), and Knowledge Vault (Dong et al., 2014) extract information based on a fixed schema or ontology, where humans help improve the accuracy of the extractions. Probase (Wu et al., 2012) produces taxonomies instead of the richly typed relations in general KGs. Our benchmark first evaluates LMs' unsupervised information extraction ability on common open information extraction datasets such as CaRB (Bhardwaj et al., 2019) and Re-OIE2016 (Zhan and Zhao, 2020), and then aligns the extracted triples to KG triples to construct a large-scale knowledge extraction benchmark. Our algorithm is similar to the generation algorithm in DeepEx (Wang et al., 2021). The focus of this work is to benchmark the zero-shot OIE performance of pre-trained LMs on both standard and factual OIE datasets. To further improve the OIE performance, the ranking module in DeepEx can be useful, and the structure pre-training proposed in (Wang et al., 2022) can also help.

Conclusion
We benchmark the general relational information in pre-trained language models (LM) in an open information extraction (OIE) setup. Through large-scale evaluation on both standard OIE datasets and the newly created large-scale factual OIE datasets in our IELM benchmark, we find that pre-trained LMs contain a considerable amount of open relational information. We are able to turn pre-trained LMs into zero-shot OIE systems to efficiently deliver the benchmark results. The reach of this result is broad, with potential downstream utility for deep neural network interpretation, information extraction, and knowledge graph construction. Although the results are promising, we argue that they only indicate a lower bound on what the LMs have stored. We hope our results will foster further research in the LM OIE benchmark direction.

Limitations
Regarding the limitations of our method, the argument extraction module of our algorithm relies on a third-party noun chunker. As reported, the noun chunker introduces the majority of the errors in our extraction results. A limitation of our benchmark is that we have not conducted a large-scale manual evaluation of our factual OIE datasets (TAC KBP-OIE and Wikidata-OIE). The main focus of our study is to provide a large-scale OIE benchmark; as a result, our benchmark is more challenging to use than standard OIE datasets in terms of computation costs and infrastructure. Finally, we have only benchmarked BERT and GPT-2 on our datasets. Future work could include testing a wider range of language models on our benchmark.

Ethical Considerations
We hereby acknowledge that all of the co-authors of this work are aware of the provided ACM Code of Ethics and honor the code of conduct. This work is about benchmarking the zero-shot OIE capability of pre-trained language models, including BERT and GPT. Our ethical considerations and the work's potential future impacts are discussed from the following perspectives. Language models are known to present potential risks and limitations (Brown et al., 2020), and the corpora used in pre-training (such as Wikipedia) may introduce unwanted biases and toxicity. We do not anticipate the production of harmful outputs after using our method or datasets, especially towards vulnerable populations.

Environmental Impact
We adopt the pre-trained language models BERT (Devlin et al., 2018) and the GPT-2 series (Radford et al., 2019) in our IELM benchmark. The models' carbon footprints are estimated to be 22-28 kilograms (Gibney, 2022). The focus of this study is to test the zero-shot OIE ability of pre-trained language models: we do not train language models on massive datasets, and instead only run inference on a few evaluation datasets. This costs less than 0.1% of the energy of their pre-training. This demonstrates that developing proper zero-shot learning strategies for large language models can not only deepen our understanding of their latent mechanisms, but also reduce the energy consumption and environmental impact that language models of ever-growing size may cause.

A The IELM Benchmark Details
Additional details of our open information extraction (OIE) benchmark IELM are described in this section.

A.1 Wikidata-OIE
In this section, we describe some technical details regarding the construction and evaluation of Wikidata-OIE.

A.1.1 Entity Linking
We use an unsupervised entity linker for both Wikidata-OIE dataset construction and OIE evaluation. The entity linker was originally developed in (Spitkovsky and Chang, 2012) and is based on a mention-to-entity dictionary. We build an enhanced dictionary as follows: we add new Wikipedia anchors to the dictionary, which results in 26 million entries compared to the original 21 million entries. A Wikipedia anchor to Wikidata item dictionary is then used to further link the entities (or arguments) to Wikidata. If an argument is a pronoun, we further use neuralcoref for coreference resolution.
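A minimal sketch of the dictionary-based lookup follows, with hypothetical dictionary structures; the real linker also handles coreference for pronouns, as noted above.

```python
def link_mention(mention, mention_to_entity, anchor_to_wikidata):
    """Hedged sketch of the dictionary-based linker: pick the most probable
    Wikipedia anchor for a surface mention, then map it to a Wikidata item.
    `mention_to_entity` maps a surface string to (anchor, probability)
    pairs built from Wikipedia anchor statistics."""
    candidates = mention_to_entity.get(mention.lower())
    if not candidates:
        return None
    anchor = max(candidates, key=lambda c: c[1])[0]  # most probable anchor
    return anchor_to_wikidata.get(anchor)
```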

A.1.2 Predicate Mapping
The predicate mapping of Wikidata-OIE is constructed offline using the method in Sec. 3.4. In more detail, we randomly sampled a hold-out dataset of 2,000 documents from English Wikipedia for the bootstrapped predicate mapping construction based on the TAC KBP mapping (Angeli et al., 2015). To filter out wrong predicate pairs, we manually check whether the top predicate phrases are true.

A.1.3 Gold Triples
For gold triples in Wikidata-OIE, we only preserve triples describing predicates between arguments that can be linked to corresponding Wikipedia anchors. We rule out triples of attributes about arguments and triples of auxiliary predicates (such as topic's main category, P901), resulting in 27,368,562 gold triple extractions.

A.1.4 Evaluation
Given the large number of source sentences and gold triples in Wikidata-OIE, a MongoDB database is maintained to store the gold triples to enable an efficient evaluation.

A.2 Zero-Shot Language Model Based Open Information Extraction
In this section, we introduce additional details about how we adapt pre-trained language models (LM) as zero-shot OIE systems.

A.2.1 Argument Extraction
We use the spaCy noun chunker to annotate the noun phrases in the sentences.
Algorithm 1 Beam search with attention scores.

A.2.3 Parameter Settings
We discuss the parameter setup of our OIE systems below.
The parameter settings are shared across all OIE datasets. All choices are based on the parameter study in Sec. 4.4. The beam size of Algorithm 1 is set to 6. The attention score threshold is set to 0.005, and the predicate frequency threshold is set to 10. To generate the attention weight matrix A_s of a sentence, we reduce the weights over every attention head in the last layer of the pre-trained LM using the mean operator. We analyze the effects of the various parameters below.
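As a sketch, the attention matrix A_s can be produced as follows with the HuggingFace Transformers library (an assumption; the paper does not name its implementation). Subword-to-word merging (Sec. 2) is omitted here.

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_attentions=True)

def attention_matrix(sentence: str) -> torch.Tensor:
    """Single forward pass, no fine-tuning: take the last layer's attention
    and reduce over heads with the mean operator."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    last_layer = outputs.attentions[-1]   # (batch, heads, seq, seq)
    return last_layer.mean(dim=1)[0]      # mean over heads -> (seq, seq)

A_s = attention_matrix("Dylan was born in Minnesota.")
```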
Figure 3(a) illustrates the effects of various beam sizes in Algorithm 1. We find that, in general, the larger the beam size, the better the F1. This is because our method is able to keep more potentially correct triples when more candidates are allowed. However, the F1 improvement gradually becomes subtle, while the computation costs increase more significantly. For efficiency, we do not explore larger beam sizes and set the beam size to 6.
Figure 3(b) compares the effect of different thresholds on the total score. We set the threshold to 0.005 since it achieves the best result. Note that the summed attention score is normalized by the length of the triple to penalize overly long triples. The threshold is effective mainly because of the relational information contained in the self-attention matrix: the score in the attention matrix represents the chance of a triple being a true triple based on the stored information. Figure 3(c) shows the impact of the predicate frequency threshold in identifying common predicates. The best result is achieved when it equals 10. This shows that while our method mostly identifies frequent predicates, it is also able to capture some rare predicates.
Figure 3(d) shows the comparison between the attention weights of the last layer and the mean of all layers. The attention weights of the last layer perform better. This is because the attention weights in lower layers capture low-level linguistic knowledge (Clark et al., 2019; Ramsauer et al., 2020), which is less relevant to the relational information. Figure 3(e) compares the impact of different attention reductions, i.e., mean and max, over the attention heads of the last layer. We find that "mean" performs better. The reason is that a token often attends intensively to a few specific tokens in the sequence (Michel et al., 2019), and the "mean" operator is less sensitive to such biases.

A.3 The Number of Predicates of Standard and Factual OIE Datasets
Note that there are more predicates in the standard OIE datasets than in the factual datasets. This is because, in standard OIE, predicates are open and not attached to a particular schema: they are extracted from the input sentences and are usually natural language utterances. In factual OIE, the predicates are unified into a fixed KG schema.
For example, for a person's birthplace, there are multiple natural language expressions such as "was born in" or "gave birth" in the standard OIE datasets, while only a single "birth_place" predicate exists in the factual OIE datasets.

A.4 Predicate Mapping Examples
We show example predicate mappings in a dictionary below.
The values are the corresponding OIE relations.

A.5 Comparison Systems
We compare our zero-shot OIE systems with the following OIE systems.

A.5.1 Neural OIE Systems
The following neural network based systems are selected.
• SenseOIE (Roy et al., 2019) learns to ensemble the extractions of various previous unsupervised OIE systems using supervised learning to combine their strengths.
• SpanOIE (Zhan and Zhao, 2020) presents the Re-OIE2016 dataset for a more rigorous evaluation and a span-based (instead of sequence labeling) extraction model.
• RnnOIE (Stanovsky et al., 2018) is one of the state-of-the-art OIE systems. It uses an LSTM to model OIE as a sequence tagging problem, and is trained on a large-scale OIE training set.
• NeuralOIE (Cui et al., 2018) is an encoder-decoder based architecture that adopts the copy mechanism to conduct OIE.
• IMOJIE (Kolluru et al., 2020b) is a sequence generation based OIE model that uses BERT at encoding time.
• Multi2OIE (Ro et al., 2020) models OIE as a sequence labeling problem that combines BERT with multi-head attention blocks.
• OpenIE6 (Kolluru et al., 2020a) is one of the state-of-the-art OIE systems. It treats OIE as a 2-D grid labeling task and trains a BERT-family architecture for the task.
Note that while our methods are zero-shot, without needing the specific training sets, all the neural OIE systems are supervised on the corresponding training sets.

A.5.2 Linguistic OIE Systems
We also compare our systems with the following linguistic pattern based systems developed prior to the use of neural networks.
• MinIE (Gashteovski et al., 2017) proposes to minimize facts in OIE by representing information with annotations rather than extractions and by removing redundant, overly specific information.
• PropS (Stanovsky et al., 2016) proposes a proposition structure derived from syntax using dependency trees.
• OpenIE4 (Christensen et al., 2011) is the successor to OLLIE, using similar argument and relation expansion heuristics to create OIE extractions from semantic role labeling frames.
• OpenIE5 (Saha et al., 2017, 2018) is one of the state-of-the-art OIE systems and another successor to OLLIE; it improves extractions from noun relations, numerical sentences, and conjunctive sentences depending on the linguistic patterns.
• Stanford OpenIE (Angeli et al., 2015) leverages POS tags and a dependency parser, and generates self-contained clauses from long sentences to extract the triples.

B The TAC KBP-OIE and Wikidata-OIE Datasets
We show samples of our zero-shot OIE extractions and the gold triples on both TAC KBP-OIE and Wikidata-OIE datasets.

B.1 TAC KBP-OIE OIE Extractions and Gold Extractions
We randomly sample 100 documents from the TAC KBP-OIE corpus, then sample sentences from those documents. The uncurated triples and the corresponding gold triples of the sampled sentences based on our best methods, BERT LARGE and GPT-2 XL, are shown in Figure 4 and Figure 5 respectively. We also randomly sample sentences in which BERT LARGE differs from GPT-2 XL in the resulting triples for comparison, illustrated in Figure 6. In each table, "ID" represents the document ID of the sampled sentence, "Sentence" indicates the sampled sentence, and the "Triples to gold triples" column contains the triples (on the left side of "→") and their corresponding gold triples (on the right side of "→").


Figure 1: Summary of our approach. The zero-shot open information extraction system takes a noun phrase (NP) chunked sentence as input and outputs a set of triples. The approach first conducts argument extraction by encoding NPs as argument pairs, then performs predicate extraction via decoding using the parameters (i.e., attention scores) of the pre-trained language models. The output extractions are then evaluated on our IELM benchmark.

Figure 2: Illustration of predicate extraction with a pre-trained language model (LM). The upper part of (a) represents the general search steps of producing the triple (Dylan; born in; Minnesota) from the input "Dylan_NP was born in Minnesota_NP" encoded with argument noun phrases (NP). The lower portion shows the corresponding step-by-step process. (b) shows the attention scores generated through the forward pass of the LM over the corresponding input.


Figure 3: Parameter study with BERT BASE on the TAC KBP-OIE hold-out subset.

Table 1: Evaluation of the unsupervised entity linking of Wikidata-OIE on the AIDA benchmark. An asterisk (*) indicates a supervised method.

Table 3: Comparison of the quality of different OIE systems. An asterisk (*) indicates a supervised method.
• Beam Search. The inputs of the search algorithm are an argument pair (arg_0, arg_1), a sentence s, and an attention matrix A_s of s. Both arg_0 and arg_1 are identified by the noun chunker in s. A_s is the attention matrix associated with s from the forward pass of an LM without fine-tuning. The search starts by adding the first argument arg_0 as the initial candidate in the beam. While there are still new candidates waiting to be yielded, the search continues, and the top-k candidates sorted by their attention scores are maintained in the beam. The details of the proposed beam search are described in Algorithm 1. In practice, we implement an action manager O to decide which action to take at each step. Given a candidate c in the beam, O(c) = START always happens at the beginning of the search. If c has not reached the second argument arg_1 yet, O(c) = YIELD. Otherwise, O(c) = STOP.

We use distributed servers to run the experiments. Each server is configured with four Tesla K80 12G GPUs. We set the max sequence length to 256, and the batch size to 32 for BERT LARGE and 4 for GPT-2 XL. We use publicly available implementations of the pre-trained LMs.

table ,
Similar to TAC KBP-OIE, Figure7and Figure8show the uncurated triples and the corresponding gold triples of the sampled sentences based on our zero-shot systems BERT LARGE and GPT-2 XL respectively.Figure9illustrates the randomly sampled sentences in which BERT LARGE extracts different triples compared to that from GPT-2 XL .In each table, "ID" represents the Wikipedia page's title of the sampled sentence."Sentence" indicates the sampled sentence."Triples to gold triples" column contains the triples (on the left side of "→") and their corresponding gold triples (on the right side of "→").American spokesman Adam Gadahn , also known as Azzam the American , called on Muslims in the West on Sunday to carry out more attacks like the deadly shooting at the US base in Fort Hood , Texas .Al Qaida and Taliban who operate across borders and have more and more sophisticated means of violence , are becoming bigger and bigger challenges to the international system , " said Bates Gill , director of the Stockholm International Peace Research Institute .the Swiss Bankers Association , Patrick Odier , told weekly NZZ am Sonntag that Italy and France have shown interest in deals like ones Switzerland signed this week with Germany and Britain .(The Swiss Bankers Association, " Patrick Odier) → (Swiss Bankers Association, org:top_members_employees, Patrick Odier) SF13_ENG_076 The majority of voters in Switzerland , which manages more than 25 percent of the world ' s foreign -held private wealth , support banking secrecy , according to a survey published last month by the Swiss Bankers Association in Basel .(The Swiss Bankers Association, in, Basel) → (Swiss Bankers Association, org:city_of_headquarters, Basel) SF13_ENG_078 " Americans have a right to know the truth --Islam is a religion of intolerance and violence , " said Richard Thompson , legal director of the Thomas More Law Center in Ann Arbor ." 
(The Thomas More Law Center, in, Ann Arbor) → (Thomas More Law Center, org:city_of_headquarters, Ann Arbor) SF13_ENG_082 New solutions may be enacted for these orphans , though , said Mary Robinson , CEO of the National Council for Adoption .(The National Council, " Mary Robinson) → (National Council for Adoption, org:top_members_employees, Mary Robinson) SF13_ENG_082 " When you close a country , you end up causing more problems than you prevented , " said Chuck Johnson , CEO of the National (Neal, returned to, New York) → (Patricia Neal, per:statesorprovinces_of_residence, New York) SF13_ENG_026In 1953 , she married Roald Dahl , the British writer famed for " Charlie and the Chocolate Factory , " " James and the Giant Peach " and other tales for children .(She, married, Roald Dahl) → (Patricia Neal, per:spouse, Roald Dahl) SF13_ENG_026 Oscar -winning actress Patricia Neal has died of lung cancer at her home on Martha ' s Vineyard , Massachusetts , on Sunday .(Patricia, died of, Lung Cancer) → (Patricia Neal, per:cause_of_death, lung cancer) SF13_ENG_027 Al -Hakim ' s son , Ammar al -Hakim , has been groomed for months to take his father ' s place .(Al-Hakim'S Son, " Ammar Al-Hakim) → (Abdul Aziz Al-Hakim, per:children, Ammar al-Hakim) SF13_ENG_027 Al -Hakim is the head of Supreme Iraqi Islamic Council ( SIIC ) , the largest Shiite party in Iraq .(Al-Hakim, is, Supreme Iraqi Islamic Council) → (Abdul Aziz Al-Hakim, per:employee_or_member_of, Supreme Iraqi Islamic Council) SF13_ENG_027 His former Shiite partners have gathered again to form their own group , the Iraqi National Alliance ( INA ) , which includes the influential Supreme Iraqi Islamic Council ( SIIC ) of Ammar al -Hakim , who succeeded his father Abdul Aziz al -Hakim , who died in a hospital in Iran last month after a long battle with cancer .(Al-Hakim, , who died in, Iran) → (Abdul Aziz Al-Hakim, per:country_of_death, Iran) SF13_ENG_028 " I ' d rather have Sully doing this than some stranger , or some hotshot trying to be the next Billy Mays , " said the guy who actually is the next Billy Mays , his son Billy Mays III .(The Next Billy Mays, his son, Billy Mays Iii) → (Billy Mays, per:children, Billy Mays III) SF13_ENG_029 Fignon continued cycling during and after a stint in the Army , and drew attention in the early 1980s when he managed to keep up with Hinault during a race in which amateurs and professionals rode together .(Fignon, during and after a stint in, The Army) → (Laurent Fignon, per:employee_or_member_of, the Army) SF13_ENG_029 Laurent Patrick Fignon was born in Paris on Aug .12 , 1960 .(Laurent Patrick Fignon, was born in, Paris) → (Laurent Fignon, per:city_of_birth, Paris) SF13_ENG_030 Anderson became the Tigers ' manager in June 1979 and built on a foundation that included Alan Trammell at shortstop , Lou Whitaker at second base , Kirk Gibson in the outfield and Jack Morris on the pitching staff .(Anderson, became, The Tigers' Manager) → (Sparky Anderson, per:title, manager) SF13_ENG_030 In addition to his wife , Carol , Anderson is survived by his sons , Lee and Albert ; his daughter , Shirlee Englebrecht ; and many grandchildren .(Anderson, is survived by his sons, Lee) → (Sparky Anderson, per:children, Albert) SF13_ENG_031 Blake Edwards , a writer and director who was hailed as a Hollywood master of screwball farces and rude comedies like " Victor / Victoria " and the " Pink Panther " movies , died Wednesday night in Santa Monica , Calif .He was 88 .(Blake Edwards, " writer) → (Blake Edwards, per:title, 
writer) SF13_ENG_032 Hwang , who lives in Seoul under tight police security , has written books and delivered speeches condemning Kim ' s regime as authoritarian and dictatorial .(Hwang, who lives, Seoul) → (Hwang Jang-Yop, per:cities_of_residence, Seoul) SF13_ENG_035 Kaczynska , who was 67 , married Kaczynski in 1978 after meeting him in the northern Polish city of Gdansk , where they were both academics .(Kaczynska, married, Kaczynski) → (Maria Kaczynska, per:spouse, Kaczynski) SF13_ENG_036 Upon his release he went into exile in India , where he masterminded the 1973 hijacking of a Royal Nepal Airlines plane known to be carrying hundreds of thousands of dollars in cash to fund his banned Nepali Congress party .(His Release, exile, India) → (Girija Prasad Koirala, per:countries_of_residence, India) SF13_ENG_036 Koirala began his political career as a union organiser and was imprisoned for seven years in 1960 after a failed uprising against the monarchy .(Koirala, began his, A Union Organiser) → (Girija Prasad Koirala, per:title, union organiser) SF13_ENG_036 Koirala was born in 1925 in Bihar of India where his father Krishna Prasad Koirala and his family were living in exile .(Koirala, was born in 1925 in, Bihar) → (Girija Prasad Koirala, per:city_of_birth, Bihar) SF13_ENG_036 Koirala was born in 1925 in Bihar of India at the time when his father Krishna Prasad Koirala along with his family was exiled by Rana rulers .(Koirala, was born in 1925 in, Bihar) → (Girija Prasad Koirala, per:city_of_birth, Bihar) SF13_ENG_037 Chabrol ' s survivors also include his third wife , Aurore Pajot , who acted as his script supervisor on nearly all of his movies from 1968 on and whom he married in 1981 ; and Pajot ' s daughter , Cecile Maistre , who was an assistant director on his films and wrote the script with him for " The Girl Cut in Two " ( 2007 ) .(Chabrol'S Survivors, third wife, Aurore Pajot) → (Claude Chabrol, per:spouse, Aurore Pajot) SF13_ENG_038 The joint statement said Cunningham was " an inspiring performer and dancer into his 80s , and a visionary choreographer and dedicated teacher throughout his life , he led quietly and by example , " the statement said ." (Cunningham, was ", An Inspiring Performer) → (Merce Cunningham, per:title, performer) SF13_ENG_038 Merce Cunningham , the nonagenarian choreographer , is planning for a world without him .(Merce Cunningham, " The Nonagenarian Choreographer) → (Merce Cunningham, per:title, choreographer) SF13_ENG_039 A court on Monday cleared the widower of British reality television star Jade Goody , who died of cancer last year , of rape .(British Reality Television Star Jade, who died, Cancer) → (Jade Goody, per:cause_of_death, cancer) SF13_ENG_041 Don Hewitt , the CBS newsman who invented the highly popular TV newsmagazine " 60 Minutes " and produced it for 36 years , died Wednesday .(Don Hewitt, " The Cbs Newsman) → (Don Hewitt, per:title, newsman) SF13_ENG_041 " He was the consummate television newsman , " Don Hewitt , a longtime CBS News executive and creator of the long -running " 60 Minutes " news program , told Reuters ." 
(Don Hewitt, " executive) → (Don Hewitt, per:title, executive) SF13_ENG_041 Hewitt was already a highly respected TV newsman .(Hewitt, was, A Highly Respected Tv Newsman) → (Don Hewitt, per:title, TV newsman) SF13_ENG_041 Donald Shepard Hewitt was born in New York on Dec .14 , 1922 , and grew up in the suburb of New Rochelle .(Donald Shepard Hewitt, born, New York) → (Don Hewitt, per:stateorprovince_of_birth, New York) SF13_ENG_043 Eleanor Louise Greenwich was born in Brooklyn on Oct .23 , 1940 .(Eleanor Louise Greenwich, was born, Oct.) → (Ellie Greenwich, per:date_of_birth, 1940-10-23) SF13_ENG_044 A little more than a year after Dunne died from bladder cancer , the colorful remnants of his estate have been consigned by his family to Stair Galleries in Hudson , N .Y ., which will auction them Nov .20 .(Dunne, died, Bladder Cancer) → (Dominick Dunne, per:cause_of_death, bladder cancer) SF13_ENG_048 Charles Gwathmey , an architect known for his influential modernist home designs and famous clients like director Steven Spielberg , has died .(Charles Gwathmey, " architect) → (Charles Gwathmey, per:title, architect) SF13_ENG_049 Besides his wife , Mandelbrot is survived by two sons , Laurent , of Paris , and Didier , of Newton , Mass ., and three grandchildren .(Mandelbrot, is survived by two sons " Laurent) → (Benoit Mandelbrot, per:children, Laurent) SF13_ENG_049 For years , he worked for IBM in New York .(He, for, Ibm) → (Benoit Mandelbrot, per:employee_or_member_of, IBM) SF13_ENG_049 After several years spent largely at the Centre National de la Recherche Scientifique in Paris , Mandelbrot was hired by IBM in 1958 to work at the Thomas J .Watson Research Center in Yorktown Heights , N .Y .Although he worked frequently with academic researchers and served as a visiting professor at Harvard and the Massachusetts Institute of Technology , it was not until 1987 that he began to teach at Yale , where he earned tenure in 1999 .(Mandelbrot, was hired by, Ibm) → (Benoit Mandelbrot, per:employee_or_member_of, IBM) SF13_ENG_056 " It ' s an issue for everybody in the state because peanuts are a big part of our economy , " said Don Koehler , executive director of the Georgia Peanut Commission ." (The Georgia Peanut Commission, director, Don Koehler) → (Georgia Peanut Commission, org:top_members_employees, Don Koehler) SF13_ENG_060 " We ' ll be meeting with scientists , university and science policy officials to explore practical opportunities for exchange and collaboration , " Agre , the AAAS president , was quoted as saying .(The Aaas President, " Agre) → (American Association for the Advancement of Science, org:top_members_employees, Peter C. Agre) SF13_ENG_060 However , Alan Leshner , chief executive officer of the American Association for the Advancement of Science , noted that Nobels are generally given for work that ' s a decade old or more , and that the U .S .mustn ' t become complacent .(The American Association, chief executive, Alan Leshner) → (American Association for the Advancement of Science, org:top_members_employees, Alan Leshner) SF13_ENG_062 " First of all , they never have enough funding , " said Andy Kunz , president of the U .S .High Speed Rail Association , a nonprofit that advocates a national high -speed rail network ." (The U.S. High Speed Rail Association, " Andy Kunz) → (U.S. 
High Speed Rail Association, org:top_members_employees, Andy Kunz) SF13_ENG_064 China ' s shock at NATO ' s military campaign in the former Yugoslavia helped prod Beijing into playing a bigger role in U .N .peacekeeping , said Bates Gill , director of the Stockholm International Peace Research Institute and co -author of a recent report on China ' s peacekeeping activities .(The Stockholm International Peace Research Institute, director of, Bates Gill) → (Stockholm International Peace Research Institute, org:top_members_employees, Bates Gill) SF13_ENG_064 " Non -state actors , for example , a small group of pirates off the coast of Somalia , American spokesman Adam Gadahn, also known as Azzam the American, called on Muslims in the West on Sunday to carry out more attacks like the deadly shooting at the US base in Fort Hood, Texas.(AdamGadahn,,alsoknown,Azzam)→(Adam Gadahn, per:alternate_names, Azzam) SF13_ENG_012 Gadahn, also known as Azzam the American, was born in 1978.(Gadahn, also known as, Azzam) → (Adam Gadahn, per:alternate_names, Azzam) SF13_ENG_012Gadahn grew up in California and converted to Islam before he moved to Pakistan in 1998 and attended an al-Qaida training camp six years later, according to media reports.(Gadahn,in,California)→(Adam Gadahn, per:statesorprovinces_of_residence, California) SF13_ENG_012 Gadahn moved to Pakistan in 1998, according to the FBI, and is said to have attended an al-Qaida training camp six years later, serving as a translator and consultant for the group.(Gadahn,movedto,Pakistan)→ (Adam Gadahn, per:countries_of_residence, Pakistan) SF13_ENG_014 After the Munich attack, he lived in Lebanon, Jordan and several Eastern European countries, where he had close ties to Communist bloc intelligence agencies.(He,livedin,Lebanon)→ (Mohammed Oudeh, per:countries_of_residence, Lebanon) SF13_ENG_015 Clifton attended Howard University but left before graduating to pursue poetry.(Clifton,attended,HowardUniversity)→ (Lucille Clifton, per:schools_attended, Howard University) SF13_ENG_017 "Alexander Haig devoted his career to serving our country, both as a soldier and as a diplomat," Albright said."(AlexanderHaig,as,soldier) → (Alexander Haig, per:title, soldier) SF13_ENG_017 In 1979, he resigned and retired from the Army.(He,resignedand,TheArmy) → (Alexander Haig, per:employee_or_member_of, Army) SF13_ENG_019 McGregor is survived by his wife, Lori, and four children, daughters Jordan, Taylor and Landri, and a son, Logan.(Mcgregor, by his wife, Lori) → (Keli McGregor, per:spouse, Lori) SF13_ENG_020 "Mike was a first-rate journalist, a valued member of our staff for 25 years and we will miss him," Times Editor Russ Stanton said."(Mike,was,AFirst-Rate Journalist) → (Mike Penner, per:title, first-rate journalist) SF13_ENG_020 Penner is survived by his brother, John, a copy editor at the Times, and his former wife, Times sportswriter Lisa Dillman.(Penner, by his brother, John) → (Mike Penner, per:siblings, John) SF13_ENG_024 She was charged with theft in Beaumont, Texas, for allegedly failing to pay for $10,000 worth of dental work in 2006.(She, was charged with, Theft) → (Crystal Taylor, per:charges, theft) SF13_ENG_025 Michigan native Nancy Kissel was convicted of murder and sentenced in Hong Kong's High Court in September 2005.(MichiganNativeNancyKissel,convicted of, Murder) → (Nancy Kissel, per:charges, murder) SF13_ENG_026 Neal returned to New York and concentrated on stage work.(Neal,returned,NewYork)→ (Patricia Neal, per:statesorprovinces_of_residence, New York) 
SF13_ENG_026 In 1953, she married Roald Dahl, the British writer famed for "Charlie and the Chocolate Factory," "James and the Giant Peach" and other tales for children.(She,married,RoaldDahl)→ (Patricia Neal, per:spouse, Roald Dahl) SF13_ENG_026 Oscar-winning actress Patricia Neal has died of lung cancer at her home on Martha's Vineyard, Massachusetts, on Sunday.(Neal,hasdiedof,Lung Cancer) → (Patricia Neal, per:cause_of_death, lung cancer) SF13_ENG_027 Al-Hakim's son, Ammar al-Hakim, has been groomed for months to take his father's place.(Al-Hakim'SSon,"AmmarAl-Hakim) → (Abdul Aziz Al-Hakim, per:children, Ammar al-Hakim) SF13_ENG_027 Al-Hakim is the head of Supreme Iraqi Islamic Council (SIIC), the largest Shiite party in Iraq.(Al-Hakim, of, Supreme Iraqi Islamic Council) → (Abdul Aziz Al-Hakim, per:employee_or_member_of, Supreme Iraqi Islamic Council) SF13_ENG_027 His former Shiite partners have gathered again to form their own group, the Iraqi National Alliance (INA), which includes the influential Supreme Iraqi Islamic Council (SIIC) of Ammar al-Hakim, who succeeded his father Abdul Aziz al-Hakim, who died in a hospital in Iran last month after a long battle with cancer.(Al-Hakim,died in a, Iran) → (Abdul Aziz Al-Hakim, per:country_of_death, Iran) SF13_ENG_028 "I'd rather have Sully doing this than some stranger, or some hotshot trying to be the next Billy Mays," said the guy who actually is the next Billy Mays, his son Billy Mays III.(The Next Billy Mays, , his son, Billy Mays Iii) → (Billy Mays, per:children, Billy Mays III) SF13_ENG_029 Fignon continued cycling during and after a stint in the Army, and drew attention in the early 1980s when he managed to keep up with Hinault during a race in which amateurs and professionals rode together.(Fignon,duringandaftera stint in, The Army) → (Laurent Fignon, per:employee_or_member_of, the Army) SF13_ENG_029 Laurent Patrick Fignon was born in Paris on Aug. 12, 1960.(LaurentPatrickFignon, was born in, Paris) → (Laurent Fignon, per:city_of_birth, Paris) SF13_ENG_030 Anderson became the Tigers' manager in June 1979 and built on a foundation that included Alan Trammell at shortstop, Lou Whitaker at second base, Kirk Gibson in the outfield and Jack Morris on the pitching staff.(Anderson,became,TheTigers'Manager) → (Sparky Anderson, per:title, manager) SF13_ENG_030 In addition to his wife, Carol, Anderson is survived by his sons, Lee and Albert; his daughter, Shirlee Englebrecht; and many grandchildren.(Anderson, is survived by his sons, Lee) → (Sparky Anderson, per:children, Lee) SF13_ENG_031 Blake Edwards, a writer and director who was hailed as a Hollywood master of screwball farces and rude comedies like "Victor/Victoria" and the "Pink Panther" movies, died Wednesday night in Santa Monica, Calif.He was 88. 
SF13_ENG_032 Hwang, who lives in Seoul under tight police security, has written books and delivered speeches condemning Kim's regime as authoritarian and dictatorial. (Hwang, who lives, Seoul) → (Hwang Jang-Yop, per:cities_of_residence, Seoul)
SF13_ENG_035 Kaczynska, who was 67, married Kaczynski in 1978 after meeting him in the northern Polish city of Gdansk, where they were both academics. (Kaczynska, married, Kaczynski) → (Maria Kaczynska, per:spouse, Kaczynski)
SF13_ENG_036 Upon his release he went into exile in India, where he masterminded the 1973 hijacking of a Royal Nepal Airlines plane known to be carrying hundreds of thousands of dollars in cash to fund his banned Nepali Congress party. (His Release, he went into, India) → (Girija Prasad Koirala, per:countries_of_residence, India)
SF13_ENG_036 Koirala began his political career as a union organiser and was imprisoned for seven years in 1960 after a failed uprising against the monarchy. (Koirala, career as, A Union Organiser) → (Girija Prasad Koirala, per:title, union organiser)
SF13_ENG_036 Koirala was born in 1925 in Bihar of India where his father Krishna Prasad Koirala and his family were living in exile. (Koirala, was born in 1925 in, Bihar) → (Girija Prasad Koirala, per:city_of_birth, Bihar)
SF13_ENG_036 Koirala was born in 1925 in Bihar of India at the time when his father Krishna Prasad Koirala along with his family was exiled by Rana rulers. (Koirala, born in 1925, Bihar) → (Girija Prasad Koirala, per:city_of_birth, Bihar)
SF13_ENG_037 Chabrol's survivors also include his third wife, Aurore Pajot, who acted as his script supervisor on nearly all of his movies from 1968 on and whom he married in 1981; and Pajot's daughter, Cecile Maistre, who was an assistant director on his films and wrote the script with him for "The Girl Cut in Two" (2007).
[...] spent largely at the Centre National de la Recherche Scientifique in Paris, Mandelbrot was hired by IBM in 1958 to work at the Thomas J. Watson Research Center in Yorktown Heights, N.Y. Although he worked frequently with academic researchers and served as a visiting professor at Harvard and the Massachusetts Institute of Technology, it was not until 1987 that he began to teach at Yale, where he earned tenure in 1999.
Alan Leshner, chief executive officer of the American Association for the Advancement of Science, noted that Nobels are generally given for work that's a decade old or more, and that the U.S. mustn't become complacent. (The American Association, ",", Alan Leshner) → (American Association for the Advancement of Science, org:top_members_employees, Alan Leshner)
SF13_ENG_060 Norman Augustine, the former chairman and chief executive of the Lockheed Martin Corporation; Patricia Goldman, former vice chairman of the National Transportation Safety Board; Mary Good, a former president of the American Association for the Advancement of Science; Roger Martin, dean of the Rotman School of Management; Brian O'Neill, a former president of the Insurance Institute for Highway Safety; and Sheila Widnall, a professor at the Massachusetts Institute of Technology.
SF13_ENG_064 "Non-state actors, for example, a small group of pirates off the coast of Somalia, Al Qaida and Taliban who operate across borders and have more and more sophisticated means of violence, are becoming bigger and bigger challenges to the international system," said Bates Gill, director of the Stockholm International Peace Research Institute.
[...] the Swiss Bankers Association, Patrick Odier, told weekly NZZ am Sonntag that Italy and France have shown interest in deals like ones Switzerland signed this week with Germany and Britain. (The Swiss Bankers Association, ",", Patrick Odier) → (Swiss Bankers Association, org:top_members_employees, Patrick Odier)
SF13_ENG_076 The majority of voters in Switzerland, which manages more than 25 percent of the world's foreign-held private wealth, support banking secrecy, according to a survey published last month by the Swiss Bankers Association in Basel. (The Swiss Bankers Association, in, Basel) → (Swiss Bankers Association, org:city_of_headquarters, Basel)
SF13_ENG_078 "Americans have a right to know the truth - Islam is a religion of intolerance and violence," said Richard Thompson, legal director of the Thomas More Law Center in Ann Arbor. (The Thomas More Law Center, in, Ann Arbor) → (Thomas More Law Center, org:city_of_headquarters, Ann Arbor)
SF13_ENG_082 New solutions may be enacted for these orphans, though, said Mary Robinson, CEO of the National Council for Adoption. (The National Council, of, Mary Robinson) → (National Council for Adoption, org:top_members_employees, Mary Robinson)
SF13_ENG_082 "When you close a country, you end up causing more problems than you prevented," said Chuck Johnson, CEO of the National Council for Adoption.
Al-Hakim, who died Wednesday of lung cancer in Tehran, was a symbol for many of the re-emergence of Iraq's Shiite political majority after decades of oppression under Saddam Hussein's Sunni-led regime. (Al-Hakim, who died, Tehran) → (Abdul Aziz Al-Hakim, per:city_of_death, Tehran)
SF13_ENG_028 TAMPA Heart disease, exacerbated by cocaine use, killed celebrated TV pitchman Billy Mays, according to the Hillsborough County medical examiner's final autopsy report released Friday. (Celebrated TV Pitchman Billy Mays, killed, Tampa) → (Billy Mays, per:cause_of_death, TAMPA Heart disease)
SF13_ENG_030 Family spokesman Dan Ewald said Anderson died from complications from dementia. (Anderson, died from, Dementia) → (Sparky Anderson, per:cause_of_death, dementia)
SF13_ENG_031 He served in the Coast Guard during World War II and had a rare leading role in the low-budget thriller "Strangler of the Swamp" (1946). (He, in, The Coast Guard) → (Blake Edwards, per:employee_or_member_of, Coast Guard)
SF13_ENG_031 His agent Lou Pitt confirmed he had died, while the "Entertainment Tonight" news show and website reported that he died in Santa Monica on Wednesday night from complications of pneumonia, with Andrews at his side. (He, died, Complications) → (Blake Edwards, per:cause_of_death, complications)
SF13_ENG_033 In a career that spanned seven decades, Ginzburg authored several groundbreaking studies in various fields - such as quantum theory, astrophysics, radio-astronomy and diffusion of cosmic radiation in the Earth's atmosphere - that were of "Nobel Prize caliber," said Gennady Mesyats, the director of the Lebedev Physics Institute in Moscow, where Ginzburg worked. (Ginzburg, in, The Lebedev Physics Institute) → (Vitaly Ginzburg, per:employee_or_member_of, Lebedev Physics Institute)
SF13_ENG_035 Kaczynska, who died aged 59, married Kaczynski in 1978. (Kaczynska, married, Kaczynski) → (Maria Kaczynska, per:spouse, Kaczynski)
SF13_ENG_037 Prime Minister Francois Fillon called Chabrol a "great director, producer and screenwriter (who) was one of the grand figures of the 'Nouvelle vague,' which revolutionized the style and techniques of cinema by looking at real experience, true life, that which is indiscreet and subtle." (Afp Chabrol, was, director) → (Claude Chabrol, per:title, film director); (Chabrol, director, Producer) → (Claude Chabrol, per:title, producer)
[...] as an arranger and singer, a role that saw her working with artists including Frank Sinatra and Ella Fitzgerald. (Greenwich, worked as an, Singer) → (Ellie Greenwich, per:title, singer)
SF13_ENG_044 Dunne was born in 1925 in Hartford, Connecticut, to a wealthy Roman Catholic family and grew up in some of the same social circles as the Kennedys. (Dunne, was born in 1925 in, Connecticut) → (Dominick Dunne, per:stateorprovince_of_birth, Connecticut)
SF13_ENG_044 In 1957, Dunne moved to Los Angeles to work on the CBS showcase "Playhouse 90." (Dunne, moved to, Los Angeles) → (Dominick Dunne, per:cities_of_residence, Los Angeles)
SF13_ENG_044 Dunne found his greatest prominence as a celebrity journalist while covering the 1995 murder trial of football star and actor O.J. Simpson, who had been accused of killing his ex-wife Nicole Brown Simpson and her friend Ronald Goldman. (Dunne, as, A Celebrity Journalist) → (Dominick Dunne, per:title, celebrity journalist)
SF13_ENG_045 Goldstein died in New York on Aug. 28. (Goldstein, died, Aug.) → (Adam Goldstein, per:date_of_death, XXXX-08-28)
SF13_ENG_046 After acting in movies and television, Meredith receded into a quiet life in Santa Fe, writing, painting, golfing and acting in a stage production of "The Odd Couple." (Meredith, in, Santa Fe) → (Don Meredith, per:cities_of_residence, Santa Fe)
SF13_ENG_046 In high school and at Southern Methodist University, where, already known as Dandy Don (a nickname bestowed on him by his brother), Meredith became an all-American. (Dandy Don, where, Southern Methodist University) → (Don Meredith, per:schools_attended, Southern Methodist University)
SF13_ENG_046 He spent much of his life backing away from the nickname Dandy Don, particularly during his secluded later decades in New Mexico. (He, the nickname, Dandy Don) → (Don Meredith, per:alternate_names, Dandy Don)
SF13_ENG_048 The group, known variously as the Five or the Whites (for the color of most of their buildings) or the New York School, consisted of Gwathmey, Michael Graves, Eisenman, John Hejduk and Richard Meier. (Gwathmey, of, The New York School) → (Charles Gwathmey, per:employee_or_member_of, New York School)
SF13_ENG_048 Meier, who said he had known Gwathmey for 50 years, has particularly fond memories from the time when Gwathmey was first courting his second wife, Bette-Ann Damson, and they all picked corn for dinner in a field adjacent to a barn Meier was renting on the East End of Long Island. (Gwathmey, his second wife, Bette-Ann Damson) → (Charles Gwathmey, per:spouse, Bette-Ann Damson)
SF13_ENG_057 By 2013, automakers will have dozens of plug-in electric hybrid vehicles and fully electric vehicles, said Jason Forcier, a vice president at battery maker A123 Systems Inc. (A123 Systems Inc., a vice president, Jason Forcier) → (A123 Systems Inc., org:top_members_employees, Jason Forcier)
SF13_ENG_058 Access Industries, a privately held company founded in 1986 by Len Blavatnik, has a diverse portfolio of investments in industry, real estate, media and telecommunications. (Access Industries, founded in 1986 by, Len Blavatnik) → (Access Industries, org:founded_by, Len Blavatnik)
SF13_ENG_069 "The indications are positive," said Vincent Cogliano, director of the Monographs program at IARC, which decides on carcinogen classifications. (IARC, director of the, Vincent Cogliano) → (International Agency for Research on Cancer, org:top_members_employees, Vincent Cogliano)
SF13_ENG_082 "There hasn't been a concerted push to open doors for Muslim orphans because the expectation would be that those efforts would fall flat," said Chuck Johnson, chief executive of the National Council for Adoption, a policy group in Alexandria, Va. (Adoption, in, Alexandria) → (National Council for Adoption, org:city_of_headquarters, Alexandria)
SF13_ENG_084 "The scenario itself is secret," said Eileen McMenamin, vice president of communications for the Bipartisan Policy Center (BPC), which is hosting the event dubbed "Cyber ShockWave." (The Bipartisan Policy Center, vice president of, Eileen McMenamin) → (Bipartisan Policy Center, org:top_members_employees, Eileen McMenamin)
SF13_ENG_088 Bank Julius Baer Co., based in Basel, Switzerland, sued because WikiLeaks posted accountholder information from its Cayman outpost amid allegations of money laundering and tax evasion. (Bank Julius Baer Co., based, Basel) → (Bank Julius Baer, org:city_of_headquarters, Basel)
SF13_ENG_094 The other manufacturers are two US companies, L-3 Communications and Rapiscan Systems, a unit of OSI Systems, and British rival Smiths Detection.
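Each entry above pairs a source sentence with an open triple extracted by the LM and the gold TAC KBP slot that the triple is credited for; the extracted arguments (e.g., "Gadahn") often match the gold arguments (e.g., "Adam Gadahn") only loosely. The sketch below illustrates one way such lenient argument matching could be implemented; it is a minimal illustration under our own assumptions, and the helper names (normalize, arguments_match, map_to_kbp) are hypothetical rather than taken from the paper's evaluation code.

# Minimal sketch: credit an open triple for a TAC KBP gold slot when both of
# its arguments leniently match a gold (subject, relation, object) pair.
# All names here are illustrative, not from the IELM codebase.

def normalize(span: str) -> str:
    """Lowercase, strip surrounding punctuation, and drop articles so that
    surface variants like 'Gadahn' vs. 'Adam Gadahn' compare leniently."""
    tokens = [t.strip(".,'\"") for t in span.lower().split()]
    return " ".join(t for t in tokens if t not in {"the", "a", "an"})

def arguments_match(extracted: str, gold: str) -> bool:
    """Lenient containment match: one normalized string contains the other."""
    e, g = normalize(extracted), normalize(gold)
    return bool(e) and bool(g) and (e in g or g in e)

def map_to_kbp(triple, gold_slots):
    """Return the first gold slot whose argument pair matches the extracted
    triple's subject and object; None if no gold slot matches."""
    subj, _pred, obj = triple
    for g_subj, g_rel, g_obj in gold_slots:
        if arguments_match(subj, g_subj) and arguments_match(obj, g_obj):
            return (g_subj, g_rel, g_obj)
    return None

# Example drawn from the entries above.
gold = [("Adam Gadahn", "per:countries_of_residence", "Pakistan")]
print(map_to_kbp(("Gadahn", "moved to", "Pakistan"), gold))
# -> ('Adam Gadahn', 'per:countries_of_residence', 'Pakistan')

Under this kind of matching, an extraction is scored on its arguments rather than on the exact surface form of its predicate, which is consistent with how the examples above pair open predicates such as "moved to" with closed slots such as per:countries_of_residence.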