Knowledge Base Completion for Long-Tail Entities

Lihu Chen, Simon Razniewski, Gerhard Weikum

Abstract
Despite their impressive scale, knowledge bases (KBs), such as Wikidata, still contain significant gaps. Language models (LMs) have been proposed as a source for filling these gaps. However, prior work has focused on prominent entities with rich coverage by LMs, neglecting the crucial case of long-tail entities. In this paper, we present a novel method for LM-based KB completion that is specifically geared for facts about long-tail entities. The method leverages two different LMs in two stages: one for candidate retrieval and one for candidate verification and disambiguation. To evaluate our method and various baselines, we introduce a novel dataset, called MALT, rooted in Wikidata. Our method outperforms all baselines in F1, with major gains especially in recall.
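To make the two-stage idea concrete, here is a minimal retrieve-then-verify sketch in Python, assuming off-the-shelf Hugging Face models (bert-base-cased as a masked-LM retriever, roberta-large-mnli as an entailment-based verifier). The entity, prompt template, evidence sentence, and acceptance threshold are all hypothetical illustrations; the paper's actual models, prompts, and disambiguation step are not reproduced here.

# A minimal retrieve-then-verify sketch, assuming off-the-shelf Hugging Face
# models; the entity, prompt, threshold, and model choices are illustrative
# assumptions, not the configuration used in the paper.
from transformers import pipeline

# Stage 1: candidate retrieval. A masked LM proposes fillers for the missing
# object of a relation, phrased as a cloze-style prompt.
retriever = pipeline("fill-mask", model="bert-base-cased")
prompt = "The headquarters of ExampleCorp is located in [MASK]."  # hypothetical entity
candidates = [r["token_str"].strip() for r in retriever(prompt, top_k=10)]

# Stage 2: candidate verification. An NLI model checks each candidate fact
# against an evidence passage and keeps only entailed claims.
verifier = pipeline("text-classification", model="roberta-large-mnli")
evidence = "ExampleCorp is a software company headquartered in Lyon, France."  # hypothetical evidence
accepted = []
for cand in candidates:
    claim = f"The headquarters of ExampleCorp is located in {cand}."
    pred = verifier({"text": evidence, "text_pair": claim})[0]
    if pred["label"] == "ENTAILMENT" and pred["score"] > 0.9:  # illustrative threshold
        accepted.append(cand)

print(accepted)  # candidates supported by the evidence

The split mirrors the abstract's design rationale: a generative or cloze-style retriever favors recall by over-generating candidates, while a discriminative verifier restores precision by filtering them against evidence.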
Anthology ID:
2023.matching-1.8
Volume:
Proceedings of the First Workshop on Matching From Unstructured and Structured Data (MATCHING 2023)
Month:
July
Year:
2023
Address:
Toronto, ON, Canada
Editors:
Estevam Hruschka, Tom Mitchell, Sajjadur Rahman, Dunja Mladenić, Marko Grobelnik
Venue:
MATCHING
Publisher:
Association for Computational Linguistics
Pages:
99–108
URL:
https://aclanthology.org/2023.matching-1.8
DOI:
10.18653/v1/2023.matching-1.8
Cite (ACL):
Lihu Chen, Simon Razniewski, and Gerhard Weikum. 2023. Knowledge Base Completion for Long-Tail Entities. In Proceedings of the First Workshop on Matching From Unstructured and Structured Data (MATCHING 2023), pages 99–108, Toronto, ON, Canada. Association for Computational Linguistics.
Cite (Informal):
Knowledge Base Completion for Long-Tail Entities (Chen et al., MATCHING 2023)
PDF:
https://aclanthology.org/2023.matching-1.8.pdf
Video:
https://aclanthology.org/2023.matching-1.8.mp4