NS-Hunter: BERT-Cloze Based Semantic Denoising for Distantly Supervised Relation Classification

Shen Tielin, Wang Daling, Feng Shi, Zhang Yifei


Abstract
Distant supervision can generate large-scale relation classification data quickly and economically. However, it also introduces a great number of noise sentences that do not express their labeled relations. In this paper, we exploit the pre-trained language model BERT and propose a BERT-based semantic denoising approach for distantly supervised relation classification. Specifically, we define an entity pair as a source entity and a target entity. For specific sentences whose target entities are in the BERT vocabulary (one-token words), we present the differences in dependency between the two entities for noise and non-noise sentences. For general sentences whose target entities are multi-token words, we further present the differences in the last hidden states of the [MASK]-entity (MASK-lhs for short) in BERT for noise and non-noise sentences. We regard the dependency and MASK-lhs in BERT as two semantic features of sentences. With BERT, we first capture the dependency feature to discriminate specific sentences, and then capture the MASK-lhs feature to denoise distant supervision datasets. We propose NS-Hunter, a novel denoising model that leverages the BERT-cloze ability to capture the two semantic features and integrates the above functions. According to experiments on the NYT data, our NS-Hunter model achieves the best results in distant supervision denoising and sentence-level relation classification.
Keywords: distant supervision, relation classification, semantic denoising
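The routing step described in the abstract (choosing between the dependency feature and the MASK-lhs feature according to whether the target entity is a single vocabulary token) might be sketched as follows. This is a minimal illustration, not the authors' implementation: the toy vocabulary and the names `ClozeInput`, `is_single_token`, and `build_cloze_input` are hypothetical, whitespace splitting stands in for BERT's WordPiece tokenizer, and no BERT model is actually called.

```python
from dataclasses import dataclass

# Hypothetical toy vocabulary standing in for the BERT vocabulary (assumption).
TOY_VOCAB = {"obama", "hawaii", "paris", "france", "was", "born", "in"}


@dataclass
class ClozeInput:
    masked_sentence: str  # sentence with the target entity replaced by [MASK]
    route: str            # "dependency" or "mask_lhs"


def is_single_token(entity: str, vocab: set) -> bool:
    """Return True when the entity is one word that appears in the vocabulary."""
    words = entity.lower().split()
    return len(words) == 1 and words[0] in vocab


def build_cloze_input(sentence: str, target: str, vocab: set = TOY_VOCAB) -> ClozeInput:
    """Mask the target entity and pick which semantic feature would be used.

    One-token targets go to the dependency route; multi-token or
    out-of-vocabulary targets go to the MASK last-hidden-state route.
    """
    if target not in sentence:
        raise ValueError("target entity does not occur in the sentence")
    # Replace only the first occurrence of the target entity.
    masked = sentence.replace(target, "[MASK]", 1)
    route = "dependency" if is_single_token(target, vocab) else "mask_lhs"
    return ClozeInput(masked_sentence=masked, route=route)
```

In a fuller pipeline, the masked sentence would be passed to a masked language model, and either the cloze behavior at the `[MASK]` position or its last hidden state would be used as the denoising feature, depending on the route.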
Anthology ID:
2021.ccl-1.99
Volume:
Proceedings of the 20th Chinese National Conference on Computational Linguistics
Month:
August
Year:
2021
Address:
Huhhot, China
Editors:
Sheng Li (李生), Maosong Sun (孙茂松), Yang Liu (刘洋), Hua Wu (吴华), Kang Liu (刘康), Wanxiang Che (车万翔), Shizhu He (何世柱), Gaoqi Rao (饶高琦)
Venue:
CCL
Publisher:
Chinese Information Processing Society of China
Pages:
1109–1120
Language:
English
URL:
https://aclanthology.org/2021.ccl-1.99
Cite (ACL):
Shen Tielin, Wang Daling, Feng Shi, and Zhang Yifei. 2021. NS-Hunter: BERT-Cloze Based Semantic Denoising for Distantly Supervised Relation Classification. In Proceedings of the 20th Chinese National Conference on Computational Linguistics, pages 1109–1120, Huhhot, China. Chinese Information Processing Society of China.
Cite (Informal):
NS-Hunter: BERT-Cloze Based Semantic Denoising for Distantly Supervised Relation Classification (Tielin et al., CCL 2021)
PDF:
https://aclanthology.org/2021.ccl-1.99.pdf