Crowdsourcing Semantic Label Propagation in Relation Classification

Anca Dumitrache, Lora Aroyo, Chris Welty


Abstract
Distant supervision is a popular method for performing relation extraction from text that is known to produce noisy labels. Most progress in relation extraction and classification has been made with crowdsourced corrections to distant-supervised labels, and there is evidence that indicates still more would be better. In this paper, we explore the problem of propagating human annotation signals gathered for open-domain relation classification through the CrowdTruth methodology for crowdsourcing, which captures ambiguity in annotations by measuring inter-annotator disagreement. Our approach propagates annotations to sentences that are similar in a low-dimensional embedding space, expanding the number of labels by two orders of magnitude. Our experiments show significant improvement in a sentence-level multi-class relation classifier.
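The propagation idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`cosine`, `propagate`), the use of cosine similarity, the `threshold` value, the similarity-weighted average, and the toy two-dimensional vectors are all assumptions chosen for illustration. It shows one plausible way a crowd-derived relation score could be carried over from annotated seed sentences to nearby unannotated sentences in an embedding space.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def propagate(labeled, unlabeled, threshold=0.8):
    """Propagate scores from labeled seeds to unlabeled sentence embeddings.

    labeled:   list of (embedding, crowd_score) pairs (hypothetical input format)
    unlabeled: list of embeddings
    Returns one value per unlabeled embedding: the similarity-weighted mean of
    the scores of all seeds whose cosine similarity is >= threshold, or None
    when no seed is similar enough.
    """
    result = []
    for vec in unlabeled:
        weighted_sum = 0.0
        weight_total = 0.0
        for seed_vec, seed_score in labeled:
            sim = cosine(vec, seed_vec)
            if sim >= threshold:
                weighted_sum += sim * seed_score
                weight_total += sim
        result.append(weighted_sum / weight_total if weight_total > 0 else None)
    return result


# Toy data (assumed): two seeds with crowd scores, three unannotated sentences.
seeds = [([1.0, 0.0], 0.9), ([0.0, 1.0], 0.1)]
candidates = [[2.0, 0.0], [1.0, 1.0], [0.0, 3.0]]
scores = propagate(seeds, candidates, threshold=0.8)
```

In this toy setup, the first and third candidates point in the same direction as a seed and inherit its score, while the second lies between the two seeds, falls below the threshold for both, and stays unlabeled. The threshold trades label volume against noise: lowering it propagates to more sentences but from less similar seeds.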
Anthology ID:
W18-5503
Volume:
Proceedings of the First Workshop on Fact Extraction and VERification (FEVER)
Month:
November
Year:
2018
Address:
Brussels, Belgium
Editors:
James Thorne, Andreas Vlachos, Oana Cocarascu, Christos Christodoulopoulos, Arpit Mittal
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
16–21
URL:
https://aclanthology.org/W18-5503
DOI:
10.18653/v1/W18-5503
Cite (ACL):
Anca Dumitrache, Lora Aroyo, and Chris Welty. 2018. Crowdsourcing Semantic Label Propagation in Relation Classification. In Proceedings of the First Workshop on Fact Extraction and VERification (FEVER), pages 16–21, Brussels, Belgium. Association for Computational Linguistics.
Cite (Informal):
Crowdsourcing Semantic Label Propagation in Relation Classification (Dumitrache et al., EMNLP 2018)
PDF:
https://aclanthology.org/W18-5503.pdf
Code:
CrowdTruth/Open-Domain-Relation-Extraction