Evaluation of Coreference Resolution Systems Under Adversarial Attacks

Haixia Chai, Wei Zhao, Steffen Eger, Michael Strube


Abstract
A substantial overlap of coreferent mentions in the CoNLL dataset magnifies the recent progress on coreference resolution. This is because the CoNLL benchmark fails to evaluate the ability of coreference resolvers to link novel mentions unseen at train time. In this work, we create a new dataset based on CoNLL that largely reduces mention overlap across the entire dataset and exposes the limitations of published resolvers in two respects: lexical inference ability and understanding of low-level orthographic noise. Our findings show that (1) the requirements for the embeddings used in resolvers and the requirements of coreference resolution are, by design, in conflict, and (2) adversarial approaches are sometimes not a legitimate way to mitigate these obstacles, as they may falsely introduce mention overlaps into the adversarial training and test sets, giving an inflated impression of improvement.
Anthology ID:
2020.codi-1.16
Volume:
Proceedings of the First Workshop on Computational Approaches to Discourse
Month:
November
Year:
2020
Address:
Online
Editors:
Chloé Braud, Christian Hardmeier, Junyi Jessy Li, Annie Louis, Michael Strube
Venue:
CODI
Publisher:
Association for Computational Linguistics
Pages:
154–159
URL:
https://aclanthology.org/2020.codi-1.16
DOI:
10.18653/v1/2020.codi-1.16
Cite (ACL):
Haixia Chai, Wei Zhao, Steffen Eger, and Michael Strube. 2020. Evaluation of Coreference Resolution Systems Under Adversarial Attacks. In Proceedings of the First Workshop on Computational Approaches to Discourse, pages 154–159, Online. Association for Computational Linguistics.
Cite (Informal):
Evaluation of Coreference Resolution Systems Under Adversarial Attacks (Chai et al., CODI 2020)
PDF:
https://aclanthology.org/2020.codi-1.16.pdf