Empower Entity Set Expansion via Language Model Probing

Yunyi Zhang, Jiaming Shen, Jingbo Shang, Jiawei Han


Abstract
Entity set expansion, which aims to expand a small seed entity set with new entities belonging to the same semantic class, is a critical task that benefits many downstream NLP and IR applications, such as question answering, query understanding, and taxonomy construction. Existing set expansion methods bootstrap the seed entity set by adaptively selecting context features and extracting new entities. A key challenge for entity set expansion is to avoid selecting ambiguous context features, which shift the class semantics and cause errors to accumulate in later iterations. In this study, we propose a novel iterative set expansion framework that leverages automatically generated class names to address the semantic drift issue. In each iteration, we select one positive and several negative class names by probing a pre-trained language model, and further score each candidate entity based on the selected class names. Experiments on two datasets show that our framework generates high-quality class names and significantly outperforms previous state-of-the-art methods.
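The abstract's core step, probing a pre-trained language model with the seed entities to suggest class names, can be illustrated with a minimal sketch. The snippet below is not the authors' released code (see the CGExpan repository linked under Code); it only shows the general idea of filling a Hearst-style "[MASK] such as e1, e2, and e3" pattern with a masked language model. The model name, probe wording, and seed entities are illustrative assumptions.

```python
# Minimal sketch of class-name generation via language model probing.
# Assumes the Hugging Face `transformers` library; the model, probe
# pattern, and seed set are illustrative, not taken from the paper.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

seed_entities = ["Illinois", "California", "Texas"]  # example seed set

# Hearst-style probe: "[MASK] such as e1, e2, and e3."
probe = (
    f"{fill_mask.tokenizer.mask_token} such as "
    f"{', '.join(seed_entities[:-1])}, and {seed_entities[-1]}."
)

# Each prediction carries a candidate class name ('token_str') and a score;
# high-scoring names (e.g., "states") can serve as positive class names,
# while unrelated names can be used as negatives when scoring candidates.
for pred in fill_mask(probe, top_k=5):
    print(pred["token_str"], round(pred["score"], 4))
```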
Anthology ID:
2020.acl-main.725
Volume:
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
Month:
July
Year:
2020
Address:
Online
Editors:
Dan Jurafsky, Joyce Chai, Natalie Schluter, Joel Tetreault
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
8151–8160
URL:
https://aclanthology.org/2020.acl-main.725
DOI:
10.18653/v1/2020.acl-main.725
Cite (ACL):
Yunyi Zhang, Jiaming Shen, Jingbo Shang, and Jiawei Han. 2020. Empower Entity Set Expansion via Language Model Probing. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 8151–8160, Online. Association for Computational Linguistics.
Cite (Informal):
Empower Entity Set Expansion via Language Model Probing (Zhang et al., ACL 2020)
PDF:
https://aclanthology.org/2020.acl-main.725.pdf
Video:
 http://slideslive.com/38929134
Code:
yzhan238/CGExpan