Weiwei Gu


2023

pdf bib
Iterative Document-level Information Extraction via Imitation Learning
Yunmo Chen | William Gantt | Weiwei Gu | Tongfei Chen | Aaron White | Benjamin Van Durme
Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics

We present a novel iterative extraction model, IterX, for extracting complex relations, or templates, i.e., N-tuples representing a mapping from named slots to spans of text within a document. Documents may feature zero or more instances of a template of any given type, and the task of template extraction entails identifying the templates in a document and extracting each template’s slot values. Our imitation learning approach casts the problem as a Markov decision process (MDP), and relieves the need to use predefined template orders to train an extractor. It leads to state-of-the-art results on two established benchmarks – 4-ary relation extraction on SciREX and template extraction on MUC-4 – as well as a strong baseline on the new BETTER Granular task.

2022

pdf bib
An Empirical Study on Finding Spans
Weiwei Gu | Boyuan Zheng | Yunmo Chen | Tongfei Chen | Benjamin Van Durme
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

We present an empirical study on methods for span finding, the selection of consecutive tokens in text for some downstream tasks. We focus on approaches that can be employed in training end-to-end information extraction systems, and find there is no definitive solution without considering task properties, and provide our observations to help with future design choices: 1) a tagging approach often yields higher precision while span enumeration and boundary prediction provide higher recall; 2) span type information can benefit a boundary prediction approach; 3) additional contextualization does not help span finding in most cases.