Low-resource Deep Entity Resolution with Transfer and Active Learning

Jungo Kasai, Kun Qian, Sairam Gurajada, Yunyao Li, Lucian Popa


Abstract
Entity resolution (ER) is the task of identifying different representations of the same real-world entities across databases. It is a key step for knowledge base creation and text mining. Recent adaptation of deep learning methods for ER mitigates the need for dataset-specific feature engineering by constructing distributed representations of entity records. While these methods achieve state-of-the-art performance on benchmark data, they require large amounts of labeled data, which are typically unavailable in realistic ER applications. In this paper, we develop a deep learning-based method that targets low-resource settings for ER through a novel combination of transfer learning and active learning. We design an architecture that allows us to learn a transferable model from a high-resource setting to a low-resource one. To further adapt to the target dataset, we incorporate active learning that carefully selects a few informative examples to fine-tune the transferred model. Empirical evaluation demonstrates that our method achieves performance comparable to, if not better than, state-of-the-art learning-based methods while using an order of magnitude fewer labels.
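The active-learning step mentioned in the abstract can be illustrated with a generic uncertainty-sampling sketch. The abstract does not specify how informative examples are chosen, so this is an assumption for illustration rather than the paper's sampling strategy: candidate record pairs are ranked by the entropy of the model's predicted match probability, and the most uncertain ones are sent for labeling. The names `binary_entropy`, `select_uncertain`, and the probabilities in `probs` are hypothetical.

```python
import math


def binary_entropy(p):
    """Entropy (in nats) of a match probability p; zero when p is 0 or 1."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def select_uncertain(probabilities, k):
    """Return the indices of the k candidate pairs with the highest entropy."""
    ranked = sorted(
        range(len(probabilities)),
        key=lambda i: binary_entropy(probabilities[i]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical match probabilities produced by a transferred model
# for six unlabeled record pairs in the target dataset.
probs = [0.97, 0.52, 0.08, 0.45, 0.99, 0.70]

# Pairs 1 and 3 sit closest to 0.5, so they are the most uncertain.
picked = select_uncertain(probs, 2)
print(picked)  # [1, 3]
```

In a full loop, the selected pairs would be labeled by an annotator, added to the training set, and used to fine-tune the transferred model before the next round of selection.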
Anthology ID:
P19-1586
Volume:
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
Month:
July
Year:
2019
Address:
Florence, Italy
Editors:
Anna Korhonen, David Traum, Lluís Màrquez
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
5851–5861
URL:
https://aclanthology.org/P19-1586
DOI:
10.18653/v1/P19-1586
Cite (ACL):
Jungo Kasai, Kun Qian, Sairam Gurajada, Yunyao Li, and Lucian Popa. 2019. Low-resource Deep Entity Resolution with Transfer and Active Learning. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 5851–5861, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
Low-resource Deep Entity Resolution with Transfer and Active Learning (Kasai et al., ACL 2019)
PDF:
https://aclanthology.org/P19-1586.pdf
Video:
https://aclanthology.org/P19-1586.mp4