2024
A Matter of Perspective: Building a Multi-Perspective Annotated Dataset for the Study of Literary Quality
Yuri Bizzoni, Pascale Feldkamp Moreira, Ida Marie S. Lassen, Mads Rosendahl Thomsen, Kristoffer Nielbo
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Studies on literary quality have constantly stimulated the interest of critics, both in theoretical and empirical fields. To examine the perceived quality of literary works, some approaches have focused on data annotated through crowd-sourcing platforms, and others relied on available expert annotated data. In this work, we contribute to the debate by presenting a dataset collecting quality judgments on 9,000 19th and 20th century English-language literary novels by 3,150 predominantly Anglophone authors. We incorporate expert opinions and crowd-sourced annotations to allow comparative analyses between different literary quality evaluations. We also provide several textual metrics chosen for their potential connection with literary reception and engagement. While a large part of the texts is subject to copyright, we release quality and reception measures together with stylometric and sentiment data for each of the 9,000 novels to promote future research and comparison.
2023
Sentimental Matters - Predicting Literary Quality by Sentiment Analysis and Stylometric Features
Yuri Bizzoni, Pascale Moreira, Mads Rosendahl Thomsen, Kristoffer Nielbo
Proceedings of the 13th Workshop on Computational Approaches to Subjectivity, Sentiment, & Social Media Analysis
Over the years, the task of predicting reader appreciation or literary quality has been the object of several studies, but it remains a challenging problem in quantitative literary studies and computational linguistics alike, as its definition can vary a lot depending on the genre, the adopted features and the annotation system. This paper attempts to evaluate the impact of sentiment arc modelling versus more classical stylometric features for user-ratings of novels. We run our experiments on a corpus of English language narrative literary fiction from the 19th and 20th century, showing that syntactic and surface-level features can be powerful for the study of literary quality, but can be outperformed by sentiment-characteristics of a text.
Readability and Complexity: Diachronic Evolution of Literary Language Across 9000 Novels
Pascale Feldkamp, Yuri Bizzoni, Ida Marie S. Lassen, Mads Rosendahl Thomsen, Kristoffer Nielbo
Proceedings of the Joint 3rd International Conference on Natural Language Processing for Digital Humanities and 8th International Workshop on Computational Linguistics for Uralic Languages
Using a large corpus of English language novels from 1880 to 2000, we compare several textual features associated with literary quality, seeking to examine developments in literary language and narrative complexity through time. We show that while we find a correlation between the features, readability metrics are the only ones that exhibit a steady evolution, indicating that novels become easier to read through the 20th century but not simpler. We discuss the possibility of cultural selection as a factor and compare our findings with a subset of canonical works.
2022
Fractality of sentiment arcs for literary quality assessment: The case of Nobel laureates
Yuri Bizzoni, Kristoffer Laigaard Nielbo, Mads Rosendahl Thomsen
Proceedings of the 2nd International Workshop on Natural Language Processing for Digital Humanities
In the few works that have used NLP to study literary quality, sentiment and emotion analysis have often been considered valuable sources of information. At the same time, the idea that the nature and polarity of the sentiments expressed by a novel might have something to do with its perceived quality seems limited at best. In this paper, we argue that the fractality of narratives, specifically the long-term memory of their sentiment arcs, rather than their simple shape or average valence, might play an important role in the perception of literary quality by a human audience. In particular, we argue that such a measure can help distinguish Nobel-winning writers from control groups in a recent corpus of English language novels. To test this hypothesis, we present the results from two studies: (i) a probability distribution test, where we compute the probability of seeing a title from a Nobel laureate at different levels of arc fractality; (ii) a classification test, where we use several machine learning algorithms to measure the predictive power of both sentiment arcs and their fractality measure. Our findings indicate that, despite the competitive and complex nature of the task, the populations of Nobel and non-Nobel laureates seem to behave differently and can to some extent be told apart by a classifier.
Predicting Literary Quality How Perspectivist Should We Be?
Yuri Bizzoni, Ida Marie Lassen, Telma Peura, Mads Rosendahl Thomsen, Kristoffer Nielbo
Proceedings of the 1st Workshop on Perspectivist Approaches to NLP @LREC2022
Approaches in literary quality tend to belong to two main grounds: one sees quality as completely subjective, relying on the idiosyncratic nature of individual perspectives on the apperception of beauty; the other is ground-truth inspired, and attempts to find one or two values that predict something like an objective quality: the number of copies sold, for example, or the winning of a prestigious prize. While the first school usually does not try to predict quality at all, the second relies on a single majority vote in one form or another. In this article we discuss the advantages and limitations of these schools of thought and describe a different approach to reader’s quality judgments, which moves away from raw majority vote, but does try to create intermediate classes or groups of annotators. Drawing on previous works we describe the benefits and drawbacks of building similar annotation classes. Finally we share early results from a large corpus of literary reviews for an insight into which classes of readers might make most sense when dealing with the appreciation of literary quality.
2021
Sentiment Dynamics of Success: Fractal Scaling of Story Arcs Predicts Reader Preferences
Yuri Bizzoni, Telma Peura, Mads Rosendahl Thomsen, Kristoffer Nielbo
Proceedings of the Workshop on Natural Language Processing for Digital Humanities
We explore the correlation between the sentiment arcs of H. C. Andersen’s fairy tales and their popularity, measured as their average score on the platform GoodReads. Specifically, we do not conceive a story’s overall sentimental trend as predictive per se, but we focus on its coherence and predictability over time as represented by the arc’s Hurst exponent. We find that degrading Hurst values tend to imply degrading quality scores, while a Hurst exponent between .55 and .65 might indicate a “sweet spot” for literary appreciation.