Automatic Data Acquisition for Event Coreference Resolution

Prafulla Kumar Choubey, Ruihong Huang


Abstract
We propose leveraging lexical paraphrases and high-precision rules informed by news discourse structure to automatically collect coreferential and non-coreferential event pairs from unlabeled English news articles. We assess the quality of the acquired event pairs through both manual validation and empirical evaluation on multiple datasets covering different event domains and text genres. We find that a model trained on the acquired event pairs performs comparably to the supervised model when applied to new data outside the training data domains. Further, augmenting human-annotated data with the acquired event pairs yields performance gains on both in-domain and out-of-domain evaluation datasets.
Anthology ID:
2021.eacl-main.101
Volume:
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume
Month:
April
Year:
2021
Address:
Online
Editors:
Paola Merlo, Jörg Tiedemann, Reut Tsarfaty
Venue:
EACL
Publisher:
Association for Computational Linguistics
Pages:
1185–1196
URL:
https://aclanthology.org/2021.eacl-main.101
DOI:
10.18653/v1/2021.eacl-main.101
Cite (ACL):
Prafulla Kumar Choubey and Ruihong Huang. 2021. Automatic Data Acquisition for Event Coreference Resolution. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, pages 1185–1196, Online. Association for Computational Linguistics.
Cite (Informal):
Automatic Data Acquisition for Event Coreference Resolution (Choubey & Huang, EACL 2021)
PDF:
https://aclanthology.org/2021.eacl-main.101.pdf
Code:
prafulla77/event-coref-eacl-2021
Data:
ECB+