Aaron White


2024

MultiMUC: Multilingual Template Filling on MUC-4
William Gantt | Shabnam Behzad | Hannah An | Yunmo Chen | Aaron White | Benjamin Van Durme | Mahsa Yarmohammadi
Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)

We introduce MultiMUC, the first multilingual parallel corpus for template filling, comprising translations of the classic MUC-4 template filling benchmark into five languages: Arabic, Chinese, Farsi, Korean, and Russian. We obtain automatic translations from a strong multilingual machine translation system and manually project the original English annotations into each target language. For all languages, we also provide human translations for key portions of the dev and test splits. Finally, we present baselines on MultiMUC both with state-of-the-art template filling models for MUC-4 and with ChatGPT. We release MultiMUC and the supervised baselines to facilitate further work on document-level information extraction in multilingual settings.
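
The annotation projection step described above can be made concrete with a short sketch. The following is a minimal, hypothetical example (not the MultiMUC pipeline) of carrying a token-span annotation from an English sentence onto its machine translation via a word alignment; the alignment pairs and tokens stand in for output from an automatic aligner and translator, and all names are illustrative.

```python
# Minimal sketch of annotation projection: map a source-side token span
# onto the target translation using a word alignment. Illustrative only;
# this is not the MultiMUC projection pipeline.

def project_span(alignment, span):
    """Project a source span (start, end exclusive) to the smallest
    target span covering all target tokens aligned to it."""
    src_start, src_end = span
    tgt_indices = [t for s, t in alignment if src_start <= s < src_end]
    if not tgt_indices:
        return None  # unaligned spans would need manual correction
    return min(tgt_indices), max(tgt_indices) + 1

src = ["the", "embassy", "was", "bombed"]
tgt = ["fue", "bombardeada", "la", "embajada"]  # illustrative reordering
align = [(0, 2), (1, 3), (2, 0), (3, 1)]        # (src_idx, tgt_idx) pairs

start, end = project_span(align, (0, 2))        # span "the embassy"
print(tgt[start:end])                           # ['la', 'embajada']
```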

2023

Iterative Document-level Information Extraction via Imitation Learning
Yunmo Chen | William Gantt | Weiwei Gu | Tongfei Chen | Aaron White | Benjamin Van Durme
Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics

We present a novel iterative extraction model, IterX, for extracting complex relations, or templates, i.e., N-tuples representing a mapping from named slots to spans of text within a document. Documents may feature zero or more instances of a template of any given type, and the task of template extraction entails identifying the templates in a document and extracting each template’s slot values. Our imitation learning approach casts the problem as a Markov decision process (MDP), and relieves the need to use predefined template orders to train an extractor. It leads to state-of-the-art results on two established benchmarks – 4-ary relation extraction on SciREX and template extraction on MUC-4 – as well as a strong baseline on the new BETTER Granular task.
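
To make the MDP framing concrete, here is a minimal sketch (far simpler than IterX itself) of iterative template extraction as a sequential decision process: at each step a policy observes the document plus the templates extracted so far and either emits another template or stops, so no predefined template order is needed. The `toy_policy` is a hypothetical stand-in for a model trained with imitation learning.

```python
# Minimal sketch of template extraction as an MDP: states carry the
# document and templates extracted so far; actions emit a template or
# stop. Illustrative only; not the IterX architecture.

from dataclasses import dataclass, field

@dataclass
class State:
    document: str
    extracted: list = field(default_factory=list)  # templates so far

def extract_templates(document, policy, max_steps=10):
    state = State(document)
    for _ in range(max_steps):
        action = policy(state)          # a template dict, or None to stop
        if action is None:
            break
        state.extracted.append(action)  # state transition
    return state.extracted

# Toy policy: emit one hard-coded template, then stop.
def toy_policy(state):
    if state.extracted:
        return None
    return {"type": "ATTACK", "Perp": "unknown", "Target": "embassy"}

print(extract_templates("...", toy_policy))
```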

A Unified View of Evaluation Metrics for Structured Prediction
Yunmo Chen | William Gantt | Tongfei Chen | Aaron White | Benjamin Van Durme
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

We present a conceptual framework that unifies a variety of evaluation metrics for different structured prediction tasks (e.g., event and relation extraction, syntactic and semantic parsing). Our framework requires representing the outputs of these tasks as objects of certain data types, and derives metrics through matching of common substructures, possibly followed by normalization. We demonstrate how commonly used metrics for a number of tasks can be succinctly expressed by this framework, and show that new metrics can be naturally derived in a bottom-up way based on an output structure. We release a library that implements this derivation, making it easy to define new metrics. Finally, we consider how specific characteristics of tasks motivate metric design decisions, and suggest possible modifications to existing metrics in line with those motivations.
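
As a rough illustration of the framework's recipe (match common substructures, then normalize), the sketch below derives precision/recall/F1 by representing outputs as sets of typed tuples. This is an assumption-laden toy, not the released library, which supports much richer data types and matching.

```python
# Minimal sketch: a metric as substructure matching plus normalization.
# Outputs are represented as sets of typed tuples; matching here is exact
# set intersection, and normalization yields precision/recall/F1.

def match_count(pred, gold):
    """Substructure matching: exact matching of typed tuples."""
    return len(set(pred) & set(gold))

def f1(pred, gold):
    m = match_count(pred, gold)
    p = m / len(pred) if pred else 0.0  # normalize by prediction count
    r = m / len(gold) if gold else 0.0  # normalize by gold count
    return 2 * p * r / (p + r) if p + r else 0.0

gold = [("ATTACK", "bombing"), ("VICTIM", "diplomat")]
pred = [("ATTACK", "bombing"), ("VICTIM", "journalist")]
print(f1(pred, gold))  # 0.5
```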

On Event Individuation for Document-Level Information Extraction
William Gantt | Reno Kriz | Yunmo Chen | Siddharth Vashishtha | Aaron White
Findings of the Association for Computational Linguistics: EMNLP 2023

As information extraction (IE) systems have grown more adept at processing whole documents, the classic task of *template filling* has seen renewed interest as a benchmark for document-level IE. In this position paper, we call into question the suitability of template filling for this purpose. We argue that the task demands definitive answers to thorny questions of *event individuation* — the problem of distinguishing distinct events — about which even human experts disagree. Through an annotation study and error analysis, we show that this raises concerns about the usefulness of template filling metrics, the quality of datasets for the task, and the ability of models to learn it. Finally, we consider possible solutions.