Florian Barth
2022
MONAPipe: Modes of Narration and Attribution Pipeline for German Computational Literary Studies and Language Analysis in spaCy
Tillmann Dönicke
|
Florian Barth
|
Hanna Varachkina
|
Caroline Sporleder
Proceedings of the 18th Conference on Natural Language Processing (KONVENS 2022)
Levels of Non-Fictionality in Fictional Texts
Florian Barth
|
Hanna Varachkina
|
Tillmann Dönicke
|
Luisa Gödeke
Proceedings of the 18th Joint ACL - ISO Workshop on Interoperable Semantic Annotation within LREC2022
The annotation and automatic recognition of non-fictional discourse within a text is an important, yet unresolved task in literary research. While non-fictional passages can consist of several clauses or sentences, we argue that 1) an entity-level classification of fictionality and 2) the linking of Wikidata identifiers can be used to automatically identify (non-)fictional discourse. We query Wikidata and DBpedia for relevant information about a requested entity as well as the corresponding literary text to determine the entity’s fictionality status and assign a Wikidata identifier, if unequivocally possible. We evaluate our methods on an exemplary text from our diachronic literary corpus, where our methods classify 97% of persons and 62% of locations correctly as fictional or real. Furthermore, 75% of the resolved persons and 43% of the resolved locations are resolved correctly. In a quantitative experiment, we apply the entity-level fictionality tagger to our corpus and conclude that more non-fictional passages can be identified when information about real entities is available.
Search