Semi-Automated Labeling of Requirement Datasets for Relation Extraction
Jeremias Bohn
Jannik Fischbach
Martin Schmitt
Hinrich Schütze
Andreas Vogelsang
Proceedings of the 14th Workshop on Building and Using Comparable Corpora (BUCC 2021)
Having human annotators create datasets manually is laborious and can lead to biased and inhomogeneous labels. We propose a flexible, semi-automatic framework for labeling data for relation extraction. Furthermore, we provide a dataset of preprocessed sentences from the requirements engineering domain, including a set of automatically created labels as well as hand-crafted ones. In our case study, we compare the human and automatic labels and show that there is a substantial overlap between the two sets of annotations.