Automatic Error Analysis for Document-level Information Extraction

Aliva Das, Xinya Du, Barry Wang, Kejian Shi, Jiayuan Gu, Thomas Porter, Claire Cardie


Abstract
Document-level information extraction (IE) tasks have recently begun to be revisited in earnest using the end-to-end neural network techniques that have been successful on their sentence-level IE counterparts. Evaluation of these approaches, however, has been limited in a number of dimensions. In particular, the precision/recall/F1 scores typically reported provide little insight into the range of errors the models make. We build on the work of Kummerfeld and Klein (2013) to propose a transformation-based framework for automating error analysis in document-level event and (N-ary) relation extraction. We employ our framework to compare two state-of-the-art document-level template-filling approaches on datasets from three domains; then, to gauge progress in IE since its inception 30 years ago, we compare them against four systems from the MUC-4 (1992) evaluation.
Anthology ID:
2022.acl-long.274
Volume:
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
May
Year:
2022
Address:
Dublin, Ireland
Editors:
Smaranda Muresan, Preslav Nakov, Aline Villavicencio
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
3960–3975
URL:
https://aclanthology.org/2022.acl-long.274
DOI:
10.18653/v1/2022.acl-long.274
Cite (ACL):
Aliva Das, Xinya Du, Barry Wang, Kejian Shi, Jiayuan Gu, Thomas Porter, and Claire Cardie. 2022. Automatic Error Analysis for Document-level Information Extraction. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 3960–3975, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
Automatic Error Analysis for Document-level Information Extraction (Das et al., ACL 2022)
PDF:
https://aclanthology.org/2022.acl-long.274.pdf
Software:
2022.acl-long.274.software.zip
Code:
icejinx33/auto-err-template-fill
Data:
MUC-4, SciREX