Revisiting Sparse Retrieval for Few-shot Entity Linking

Yulin Chen, Zhenran Xu, Baotian Hu, Min Zhang


Abstract
Entity linking aims to link ambiguous mentions to their corresponding entities in a knowledge base. One of the key challenges comes from insufficient labeled data for specific domains. Although dense retrievers have achieved excellent performance on several benchmarks, their performance decreases significantly when only a limited amount of in-domain labeled data is available. In such a few-shot setting, we revisit the sparse retrieval method and propose an ELECTRA-based keyword extractor to denoise the mention context and construct a better query expression. To train the extractor, we propose a distant supervision method that automatically generates training data based on overlapping tokens between mention contexts and entity descriptions. Experimental results on the ZESHEL dataset demonstrate that the proposed method outperforms state-of-the-art models by a significant margin across all test domains, showing the effectiveness of keyword-enhanced sparse retrieval.
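The distant-supervision idea in the abstract can be sketched as follows: a context token gets a positive keyword label if it also appears in the entity description. This is a minimal illustrative sketch only, assuming whitespace tokenization and a toy stopword list; the paper's actual pipeline (ELECTRA tokenization, exact filtering rules) may differ.

```python
# Illustrative sketch of distant supervision for keyword labeling:
# a mention-context token is labeled 1 if it also occurs in the
# entity description (case-insensitive, stopwords excluded), else 0.
# Tokenizer and stopword list are simplified assumptions.

STOPWORDS = {"the", "a", "an", "of", "in", "is", "was", "to", "and", "on", "with"}

def label_keywords(mention_context: str, entity_description: str) -> list[int]:
    """Return a 0/1 label per context token via token overlap."""
    desc_tokens = {t.lower() for t in entity_description.split()} - STOPWORDS
    labels = []
    for token in mention_context.split():
        t = token.lower()
        labels.append(1 if t in desc_tokens else 0)
    return labels

# Hypothetical example: "Yoda" and "Dagobah" overlap with the description,
# so they are labeled as keywords for the sparse query.
context = "Luke trained with Yoda on Dagobah"
description = "Dagobah is a swamp planet where Yoda lived in exile"
print(label_keywords(context, description))  # [0, 0, 0, 1, 0, 1]
```

The positively labeled tokens would then serve as the denoised query for a sparse retriever such as BM25, replacing the full (noisy) mention context.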
Anthology ID:
2023.emnlp-main.789
Volume:
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
Month:
December
Year:
2023
Address:
Singapore
Editors:
Houda Bouamor, Juan Pino, Kalika Bali
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
12801–12806
URL:
https://aclanthology.org/2023.emnlp-main.789
DOI:
10.18653/v1/2023.emnlp-main.789
Cite (ACL):
Yulin Chen, Zhenran Xu, Baotian Hu, and Min Zhang. 2023. Revisiting Sparse Retrieval for Few-shot Entity Linking. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 12801–12806, Singapore. Association for Computational Linguistics.
Cite (Informal):
Revisiting Sparse Retrieval for Few-shot Entity Linking (Chen et al., EMNLP 2023)
PDF:
https://aclanthology.org/2023.emnlp-main.789.pdf
Video:
https://aclanthology.org/2023.emnlp-main.789.mp4