Fotis Jannidis


2021

Detecting Scenes in Fiction: A new Segmentation Task
Albin Zehe | Leonard Konle | Lea Katharina Dümpelmann | Evelyn Gius | Andreas Hotho | Fotis Jannidis | Lucas Kaufmann | Markus Krug | Frank Puppe | Nils Reiter | Annekea Schreiber | Nathalie Wiedmer
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume

This paper introduces the novel task of scene segmentation on narrative texts and provides an annotated corpus, a discussion of the linguistic and narrative properties of the task, and baseline experiments towards automatic solutions. A scene here is a segment of the text where story time and discourse time are more or less equal, the narration focuses on one action, and location and character constellations stay the same. The corpus we describe consists of German-language dime novels (550k tokens) that have been annotated in parallel, achieving an inter-annotator agreement of gamma = 0.7. Baseline experiments using BERT achieve an F1 score of 24%, showing that the task is very challenging. Automatic scene segmentation paves the way towards processing longer narrative texts like tales or novels by breaking them down into smaller, coherent and meaningful parts, which is an important stepping stone towards the reconstruction of plot in Computational Literary Studies but can also serve to improve tasks like coreference resolution.

2020

Twenty-two Historical Encyclopedias Encoded in TEI: a New Resource for the Digital Humanities
Thora Hagen | Erik Ketzan | Fotis Jannidis | Andreas Witt
Proceedings of the 4th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature

This paper accompanies the corpus publication of EncycNet, a novel XML/TEI annotated corpus of 22 historical German encyclopedias from the early 18th to early 20th century. We describe the creation and annotation of the corpus, including the rationale for its development, suggested methodology for TEI annotation, possible use cases and future work. While many well-developed annotation standards for lexical resources exist, none can adequately model the encyclopedias at hand, and we therefore suggest how the TEI Lex-0 standard may be modified with additional guidelines for the annotation of historical encyclopedias. As the digitization and annotation of historical encyclopedias are settling on TEI as the de facto standard, our methodology may inform similar projects.

Corpus REDEWIEDERGABE
Annelen Brunner | Stefan Engelberg | Fotis Jannidis | Ngoc Duyen Tanja Tu | Lukas Weimer
Proceedings of the Twelfth Language Resources and Evaluation Conference

This article presents corpus REDEWIEDERGABE, a German-language historical corpus with detailed annotations for speech, thought and writing representation (ST&WR). With approximately 490,000 tokens, it is the largest resource of its kind. It can be used to answer literary and linguistic research questions and serve as training material for machine learning. This paper describes the composition of the corpus and the annotation structure, discusses some methodological decisions and gives basic statistics about the forms of ST&WR found in this corpus.

2018

Delta vs. N-Gram Tracing: Evaluating the Robustness of Authorship Attribution Methods
Thomas Proisl | Stefan Evert | Fotis Jannidis | Christof Schöch | Leonard Konle | Steffen Pielström
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

2015

Towards a better understanding of Burrows’s Delta in literary authorship attribution
Stefan Evert | Thomas Proisl | Thorsten Vitt | Christof Schöch | Fotis Jannidis | Steffen Pielström
Proceedings of the Fourth Workshop on Computational Linguistics for Literature

Rule-based Coreference Resolution in German Historic Novels
Markus Krug | Frank Puppe | Fotis Jannidis | Luisa Macharowsky | Isabella Reger | Lukas Weimar
Proceedings of the Fourth Workshop on Computational Linguistics for Literature