A Study on Entity Linking Across Domains: Which Data is Best for Fine-Tuning?

Hassan Soliman, Heike Adel, Mohamed H. Gad-Elrab, Dragan Milchevski, Jannik Strötgen


Abstract
Entity linking disambiguates mentions by mapping them to entities in a knowledge graph (KG). One important question in today’s research is how to extend neural entity linking systems to new domains. In this paper, we aim at a system that enables linking mentions to entities from a general-domain KG and a domain-specific KG at the same time. In particular, we represent the entities of different KGs in a joint vector space and address the questions of which data is best suited for creating and fine-tuning that space, and whether fine-tuning harms performance on the general domain. We find that a combination of data from both the general and the special domain is most helpful. The general-domain data is especially necessary for avoiding performance loss on the general domain. While additional supervision on entities that appear in both KGs performs best in an intrinsic evaluation of the vector space, it has less impact on the downstream task of entity linking.
Anthology ID:
2022.repl4nlp-1.19
Volume:
Proceedings of the 7th Workshop on Representation Learning for NLP
Month:
May
Year:
2022
Address:
Dublin, Ireland
Venues:
ACL | RepL4NLP
Publisher:
Association for Computational Linguistics
Pages:
184–190
URL:
https://aclanthology.org/2022.repl4nlp-1.19
DOI:
10.18653/v1/2022.repl4nlp-1.19
Cite (ACL):
Hassan Soliman, Heike Adel, Mohamed H. Gad-Elrab, Dragan Milchevski, and Jannik Strötgen. 2022. A Study on Entity Linking Across Domains: Which Data is Best for Fine-Tuning?. In Proceedings of the 7th Workshop on Representation Learning for NLP, pages 184–190, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
A Study on Entity Linking Across Domains: Which Data is Best for Fine-Tuning? (Soliman et al., RepL4NLP 2022)
PDF:
https://aclanthology.org/2022.repl4nlp-1.19.pdf