Adapting Coreference Resolution Models through Active Learning

Michelle Yuan, Patrick Xia, Chandler May, Benjamin Van Durme, Jordan Boyd-Graber


Abstract
Neural coreference resolution models trained on one dataset may not transfer to new, low-resource domains. Active learning mitigates this problem by sampling a small subset of data for annotators to label. While active learning is well-defined for classification tasks, its application to coreference resolution is neither well-defined nor fully understood. This paper explores how to actively label coreference, examining sources of model uncertainty and document reading costs. We compare uncertainty sampling strategies and their advantages through thorough error analysis. In both synthetic and human experiments, labeling spans within the same document is more effective than annotating spans across documents. The findings contribute to a more realistic development of coreference resolution models.
Anthology ID:
2022.acl-long.519
Volume:
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
May
Year:
2022
Address:
Dublin, Ireland
Editors:
Smaranda Muresan, Preslav Nakov, Aline Villavicencio
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
7533–7549
URL:
https://aclanthology.org/2022.acl-long.519
DOI:
10.18653/v1/2022.acl-long.519
Cite (ACL):
Michelle Yuan, Patrick Xia, Chandler May, Benjamin Van Durme, and Jordan Boyd-Graber. 2022. Adapting Coreference Resolution Models through Active Learning. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 7533–7549, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
Adapting Coreference Resolution Models through Active Learning (Yuan et al., ACL 2022)
PDF:
https://aclanthology.org/2022.acl-long.519.pdf
Video:
https://aclanthology.org/2022.acl-long.519.mp4
Code:
forest-snow/incremental-coref