Model-based Annotation of Coreference

Rahul Aralikatte, Anders Søgaard


Abstract
Humans do not make inferences over texts, but over models of what texts are about. When annotators are asked to annotate coreferent spans of text, it is therefore a somewhat unnatural task. This paper presents an alternative in which we preprocess documents, linking entities to a knowledge base, and turn the coreference annotation task – in our case limited to pronouns – into an annotation task where annotators are asked to assign pronouns to entities. Model-based annotation is shown to lead to faster annotation and higher inter-annotator agreement, and we argue that it also opens up an alternative approach to coreference resolution. We present two new coreference benchmark datasets, for English Wikipedia and English teacher-student dialogues, and evaluate state-of-the-art coreference resolvers on them.
Anthology ID:
2020.lrec-1.9
Volume:
Proceedings of the Twelfth Language Resources and Evaluation Conference
Month:
May
Year:
2020
Address:
Marseille, France
Editors:
Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
74–79
Language:
English
URL:
https://aclanthology.org/2020.lrec-1.9
Cite (ACL):
Rahul Aralikatte and Anders Søgaard. 2020. Model-based Annotation of Coreference. In Proceedings of the Twelfth Language Resources and Evaluation Conference, pages 74–79, Marseille, France. European Language Resources Association.
Cite (Informal):
Model-based Annotation of Coreference (Aralikatte & Søgaard, LREC 2020)
PDF:
https://aclanthology.org/2020.lrec-1.9.pdf
Code:
rahular/model-based-coref
Data:
QuAC, WikiCoref