ExtEnD: Extractive Entity Disambiguation

Edoardo Barba, Luigi Procopio, Roberto Navigli


Abstract
Local models for Entity Disambiguation (ED) have today become extremely powerful, in large part thanks to the advent of large pre-trained language models. However, despite their significant performance achievements, most of these approaches frame ED through classification formulations that have intrinsic limitations, both computationally and from a modeling perspective. In contrast with this trend, here we propose ExtEnD, a novel local formulation for ED where we frame this task as a text extraction problem, and present two Transformer-based architectures that implement it. Based on experiments in and out of domain, and training over two different data regimes, we find our approach surpasses all its competitors in terms of both data efficiency and raw performance. ExtEnD outperforms its alternatives by at least 6 F1 points on the more constrained of the two data regimes and, when moving to the other higher-resourced regime, sets a new state of the art on 4 out of 4 benchmarks under consideration, with average improvements of 0.7 F1 points overall and 1.1 F1 points out of domain. In addition, to gain better insights from our results, we also perform a fine-grained evaluation of our performance on different classes of label frequency, along with an ablation study of our architectural choices and an error analysis. We release our code and models for research purposes at https://github.com/SapienzaNLP/extend.
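
To make the "text extraction" framing concrete, here is a minimal sketch in Python of one way such a formulation could be set up: the mention context is concatenated with the textual candidates, and a predicted character span over that text is mapped back to a candidate. The abstract does not specify the input format, so the `[START]`/`[END]` mention markers, the `[SEP]` separator, and the helper names `build_input` and `span_to_candidate` are all assumptions made for illustration; no model is involved, and this is not the released implementation.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) character offsets, end exclusive


def build_input(context: str, candidates: List[str],
                sep: str = " [SEP] ") -> Tuple[str, List[Span]]:
    """Append every candidate to the context and record where each one lands.

    The separator string is a hypothetical choice, not taken from the paper.
    """
    text = context
    spans: List[Span] = []
    for cand in candidates:
        text += sep
        start = len(text)
        text += cand
        spans.append((start, len(text)))
    return text, spans


def span_to_candidate(pred: Span, spans: List[Span],
                      candidates: List[str]) -> str:
    """Return the candidate whose span overlaps most with the predicted span.

    If nothing overlaps, the first candidate is returned (ties go to the
    earliest index).
    """
    def overlap(a: Span, b: Span) -> int:
        return max(0, min(a[1], b[1]) - max(a[0], b[0]))

    best = max(range(len(spans)), key=lambda i: overlap(pred, spans[i]))
    return candidates[best]


# Hypothetical example: the mention "Milan" is marked in its context.
context = "Rome beat [START] Milan [END] 2-0 on Sunday."
candidates = ["Milan", "A.C. Milan", "Inter Milan"]
text, spans = build_input(context, candidates)

# Pretend an extractive model selected the span covering the second candidate.
predicted = spans[1]
chosen = span_to_candidate(predicted, spans, candidates)
```

Under this setup the number of candidates affects only the input length, not the size of an output layer, which is one way to read the contrast the abstract draws with classification formulations.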
Anthology ID:
2022.acl-long.177
Volume:
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
May
Year:
2022
Address:
Dublin, Ireland
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
2478–2488
URL:
https://aclanthology.org/2022.acl-long.177
DOI:
10.18653/v1/2022.acl-long.177
Cite (ACL):
Edoardo Barba, Luigi Procopio, and Roberto Navigli. 2022. ExtEnD: Extractive Entity Disambiguation. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 2478–2488, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
ExtEnD: Extractive Entity Disambiguation (Barba et al., ACL 2022)
PDF:
https://aclanthology.org/2022.acl-long.177.pdf
Code
 sapienzanlp/extend
Data
AIDA CoNLL-YAGO