Finding Influential Instances for Distantly Supervised Relation Extraction

Zifeng Wang, Rui Wen, Xi Chen, Shao-Lun Huang, Ningyu Zhang, Yefeng Zheng


Abstract
Distant supervision (DS) is an effective way to expand datasets for training relation extraction (RE) models, but it often suffers from high label noise. Existing approaches based on attention, reinforcement learning, or GANs are black-box models, so they neither offer a meaningful interpretation of sample selection in DS nor remain stable across domains. In contrast, this work proposes REIF, a novel model-agnostic instance sampling method for DS based on the influence function (IF). Our method uses IF to identify favorable and unfavorable instances in each bag and then performs dynamic instance sampling. We design a fast influence sampling algorithm that reduces the computational complexity from 𝒪(mn) to 𝒪(1), and we analyze its robustness with respect to the chosen sampling function. Experiments show that, simply by sampling the favorable instances during training, REIF outperforms a series of baselines with complicated architectures. We also demonstrate that REIF supports interpretable instance selection.
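The abstract does not spell out the sampling function or the fast 𝒪(1) algorithm, so the following is only a minimal sketch of the general idea it describes: score each training instance by the classical influence function and keep favorable instances with higher probability. It assumes a tiny L2-regularized logistic regression with a bias feature and one input feature, a small clean validation set, and a sigmoid sampling rule. The names `influences`, `keep_probability`, and `alpha`, as well as the toy data, are hypothetical and do not come from the paper. An instance's influence here is -g_val^T H^{-1} g_z, so a negative value means that upweighting the instance is expected to lower the validation loss.

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def grad(theta, x, y):
    """Gradient of the logistic loss for one instance; y is 0 or 1."""
    p = sigmoid(theta[0] * x[0] + theta[1] * x[1])
    return [(p - y) * x[0], (p - y) * x[1]]

def fit(data, l2=0.1, lr=0.5, steps=2000):
    """Plain gradient descent on the mean logistic loss plus an L2 penalty."""
    theta = [0.0, 0.0]
    n = len(data)
    for _ in range(steps):
        g = [l2 * theta[0], l2 * theta[1]]
        for x, y in data:
            gi = grad(theta, x, y)
            g[0] += gi[0] / n
            g[1] += gi[1] / n
        theta = [theta[0] - lr * g[0], theta[1] - lr * g[1]]
    return theta

def hessian(theta, data, l2=0.1):
    """2x2 Hessian of the regularized training objective."""
    n = len(data)
    h = [[l2, 0.0], [0.0, l2]]
    for x, _ in data:
        p = sigmoid(theta[0] * x[0] + theta[1] * x[1])
        w = p * (1.0 - p) / n
        for i in range(2):
            for j in range(2):
                h[i][j] += w * x[i] * x[j]
    return h

def solve2(h, v):
    """Solve h @ s = v for a 2x2 matrix h in closed form."""
    det = h[0][0] * h[1][1] - h[0][1] * h[1][0]
    return [(h[1][1] * v[0] - h[0][1] * v[1]) / det,
            (h[0][0] * v[1] - h[1][0] * v[0]) / det]

def influences(train, val, l2=0.1):
    """Influence of upweighting each training instance on the validation loss."""
    theta = fit(train, l2)
    g_val = [0.0, 0.0]
    for x, y in val:
        g = grad(theta, x, y)
        g_val[0] += g[0] / len(val)
        g_val[1] += g[1] / len(val)
    s = solve2(hessian(theta, train, l2), g_val)  # H^{-1} g_val (H is symmetric)
    scores = []
    for x, y in train:
        g = grad(theta, x, y)
        scores.append(-(s[0] * g[0] + s[1] * g[1]))
    return scores

def keep_probability(scores, alpha=5.0):
    """Assumed sampling rule: negative (favorable) influence gives p > 0.5."""
    return [sigmoid(-alpha * s) for s in scores]

# Toy bag: features are (bias, x); the last instance is deliberately mislabeled.
train = [((1.0, 2.0), 1), ((1.0, 1.5), 1),
         ((1.0, -2.0), 0), ((1.0, -1.5), 0),
         ((1.0, 2.0), 0)]
val = [((1.0, 1.8), 1), ((1.0, -1.8), 0)]

infl = influences(train, val)
probs = keep_probability(infl)
for (x, y), s, p in zip(train, infl, probs):
    print(f"x={x[1]:+.1f} y={y} influence={s:+.3f} keep_prob={p:.2f}")
```

During training, an instance would then be kept whenever a uniform random draw falls below its `keep_prob`. This sketch recomputes the Hessian for the whole bag, so it illustrates only the scoring and sampling idea, not the complexity reduction claimed in the abstract.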
Anthology ID:
2022.coling-1.233
Volume:
Proceedings of the 29th International Conference on Computational Linguistics
Month:
October
Year:
2022
Address:
Gyeongju, Republic of Korea
Venue:
COLING
Publisher:
International Committee on Computational Linguistics
Pages:
2639–2650
URL:
https://aclanthology.org/2022.coling-1.233
Cite (ACL):
Zifeng Wang, Rui Wen, Xi Chen, Shao-Lun Huang, Ningyu Zhang, and Yefeng Zheng. 2022. Finding Influential Instances for Distantly Supervised Relation Extraction. In Proceedings of the 29th International Conference on Computational Linguistics, pages 2639–2650, Gyeongju, Republic of Korea. International Committee on Computational Linguistics.
Cite (Informal):
Finding Influential Instances for Distantly Supervised Relation Extraction (Wang et al., COLING 2022)
PDF:
https://aclanthology.org/2022.coling-1.233.pdf