Resources and Evaluations for Danish Entity Resolution

Maria Barrett, Hieu Lam, Martin Wu, Ophélie Lacroix, Barbara Plank, Anders Søgaard


Abstract
Automatic coreference resolution is understudied in Danish even though most of the Danish Dependency Treebank (Buch-Kromann, 2003) is annotated with coreference relations. This paper describes a conversion of its partial, yet well-documented, coreference relations into coreference clusters and the training and evaluation of coreference models on this data. To the best of our knowledge, these are the first publicly available, neural coreference models for Danish. We also present a new entity linking annotation on the dataset using WikiData identifiers, a named entity disambiguation (NED) dataset, and a larger automatically created NED dataset enabling wikily supervised NED models. The entity linking annotation is benchmarked using a state-of-the-art neural entity disambiguation model.
Anthology ID:
2021.crac-1.7
Volume:
Proceedings of the Fourth Workshop on Computational Models of Reference, Anaphora and Coreference
Month:
November
Year:
2021
Address:
Punta Cana, Dominican Republic
Venues:
CRAC | EMNLP
Publisher:
Association for Computational Linguistics
Pages:
63–69
URL:
https://aclanthology.org/2021.crac-1.7
DOI:
10.18653/v1/2021.crac-1.7
Cite (ACL):
Maria Barrett, Hieu Lam, Martin Wu, Ophélie Lacroix, Barbara Plank, and Anders Søgaard. 2021. Resources and Evaluations for Danish Entity Resolution. In Proceedings of the Fourth Workshop on Computational Models of Reference, Anaphora and Coreference, pages 63–69, Punta Cana, Dominican Republic. Association for Computational Linguistics.
Cite (Informal):
Resources and Evaluations for Danish Entity Resolution (Barrett et al., CRAC 2021)
PDF:
https://aclanthology.org/2021.crac-1.7.pdf