EntEval: A Holistic Evaluation Benchmark for Entity Representations

Mingda Chen, Zewei Chu, Yang Chen, Karl Stratos, Kevin Gimpel


Abstract
Rich entity representations are useful for a wide class of problems involving entities. Despite their importance, there is no standardized benchmark that evaluates the overall quality of entity representations. In this work, we propose EntEval: a test suite of diverse tasks that require nontrivial understanding of entities, including entity typing, entity similarity, entity relation prediction, and entity disambiguation. In addition, we develop training techniques for learning better entity representations by using natural hyperlink annotations in Wikipedia. We identify effective objectives for incorporating the contextual information in hyperlinks into state-of-the-art pretrained language models (Peters et al., 2018) and show that they improve strong baselines on multiple EntEval tasks.
Anthology ID:
D19-1040
Volume:
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
Month:
November
Year:
2019
Address:
Hong Kong, China
Editors:
Kentaro Inui, Jing Jiang, Vincent Ng, Xiaojun Wan
Venues:
EMNLP | IJCNLP
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Pages:
421–433
URL:
https://aclanthology.org/D19-1040/
DOI:
10.18653/v1/D19-1040
Cite (ACL):
Mingda Chen, Zewei Chu, Yang Chen, Karl Stratos, and Kevin Gimpel. 2019. EntEval: A Holistic Evaluation Benchmark for Entity Representations. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 421–433, Hong Kong, China. Association for Computational Linguistics.
Cite (Informal):
EntEval: A Holistic Evaluation Benchmark for Entity Representations (Chen et al., EMNLP-IJCNLP 2019)
PDF:
https://aclanthology.org/D19-1040.pdf
Code:
ZeweiChu/EntEval
Data:
FEVER, WikiSRS