The FairyNet Corpus - Character Networks for German Fairy Tales

David Schmidt, Albin Zehe, Janne Lorenzen, Lisa Sergel, Sebastian Düker, Markus Krug, Frank Puppe


Abstract
This paper presents a data set of German fairy tales, manually annotated with character networks that were obtained with high inter-rater agreement. The release of this corpus provides an opportunity to train and compare different algorithms for the extraction of character networks, which so far has barely been possible due to the heterogeneous interests of previous researchers. We demonstrate the usefulness of our data set by providing baseline experiments for the automatic extraction of character networks, applying a rule-based pipeline as well as a neural approach, and find that the neural approach outperforms the rule-based approach in most evaluation settings.
Anthology ID:
2021.latechclfl-1.6
Volume:
Proceedings of the 5th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature
Month:
November
Year:
2021
Address:
Punta Cana, Dominican Republic (online)
Venues:
CLFL | EMNLP | LaTeCH | LaTeCHCLfL
SIG:
SIGHUM
Publisher:
Association for Computational Linguistics
Pages:
49–56
URL:
https://aclanthology.org/2021.latechclfl-1.6
PDF:
https://aclanthology.org/2021.latechclfl-1.6.pdf