Jannik Fischbach
2021
Semi-Automated Labeling of Requirement Datasets for Relation Extraction
Jeremias Bohn | Jannik Fischbach | Martin Schmitt | Hinrich Schütze | Andreas Vogelsang
Proceedings of the 14th Workshop on Building and Using Comparable Corpora (BUCC 2021)
Manual dataset creation by human annotators is laborious and can lead to biased and inhomogeneous labels. We propose a flexible, semi-automatic framework for labeling data for relation extraction. Furthermore, we provide a dataset of preprocessed sentences from the requirements engineering domain, including both automatically created and hand-crafted labels. In our case study, we compare the human and automatic labels and show that there is substantial overlap between the two sets of annotations.