Cross-Document Event Coreference Resolution: Instruct Humans or Instruct GPT?

Jin Zhao, Nianwen Xue, Bonan Min


Abstract
This paper explores using Large Language Models (LLMs) to perform Cross-Document Event Coreference Resolution (CDEC) annotation and evaluates how they fare against human annotators with different levels of training. Specifically, we formulate CDEC as a multi-category classification problem on pairs of events represented as decontextualized sentences, and compare the predictions of GPT-4 with the judgments of fully trained annotators and crowdworkers on the same data set. Our study indicates that GPT-4 with zero-shot learning outperforms crowdworkers by a large margin and exhibits a level of performance comparable to that of trained annotators. Upon closer analysis, GPT-4 also tends to be overly confident, forcing annotation decisions even when such decisions are not warranted due to insufficient information. Our results have implications for how to perform complicated annotation tasks such as CDEC in the age of LLMs: the best way to acquire such annotations may be to combine the strengths of LLMs and trained human annotators in the annotation process, while relying on untrained or undertrained crowdworkers is no longer a viable option for acquiring high-quality data to advance the state of the art on such problems.
Anthology ID:
2023.conll-1.38
Volume:
Proceedings of the 27th Conference on Computational Natural Language Learning (CoNLL)
Month:
December
Year:
2023
Address:
Singapore
Editors:
Jing Jiang, David Reitter, Shumin Deng
Venue:
CoNLL
Publisher:
Association for Computational Linguistics
Pages:
561–574
URL:
https://aclanthology.org/2023.conll-1.38
DOI:
10.18653/v1/2023.conll-1.38
Cite (ACL):
Jin Zhao, Nianwen Xue, and Bonan Min. 2023. Cross-Document Event Coreference Resolution: Instruct Humans or Instruct GPT?. In Proceedings of the 27th Conference on Computational Natural Language Learning (CoNLL), pages 561–574, Singapore. Association for Computational Linguistics.
Cite (Informal):
Cross-Document Event Coreference Resolution: Instruct Humans or Instruct GPT? (Zhao et al., CoNLL 2023)
PDF:
https://aclanthology.org/2023.conll-1.38.pdf