Pre-training Mention Representations in Coreference Models

Yuval Varkel, Amir Globerson


Abstract
Collecting labeled data for coreference resolution is a challenging task, requiring skilled annotators. It is thus desirable to develop coreference resolution models that can make use of unlabeled data. Here we provide such an approach for the powerful class of neural coreference models. These models rely on representations of mentions, and we show that these representations can be learned in a self-supervised manner to improve resolution accuracy. We propose two self-supervised tasks that are closely related to coreference resolution and thus improve mention representations. Applying this approach to the GAP dataset yields new state-of-the-art results.
Anthology ID:
2020.emnlp-main.687
Volume:
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
Month:
November
Year:
2020
Address:
Online
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
8534–8540
URL:
https://aclanthology.org/2020.emnlp-main.687
DOI:
10.18653/v1/2020.emnlp-main.687
Cite (ACL):
Yuval Varkel and Amir Globerson. 2020. Pre-training Mention Representations in Coreference Models. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 8534–8540, Online. Association for Computational Linguistics.
Cite (Informal):
Pre-training Mention Representations in Coreference Models (Varkel & Globerson, EMNLP 2020)
PDF:
https://aclanthology.org/2020.emnlp-main.687.pdf
Video:
https://slideslive.com/38939381