Mads Thomsen


Good Reads and Easy Novels: Readability and Literary Quality in a Corpus of US-published Fiction
Yuri Bizzoni | Pascale Moreira | Nicole Dwenger | Ida Lassen | Mads Thomsen | Kristoffer Nielbo
Proceedings of the 24th Nordic Conference on Computational Linguistics (NoDaLiDa)

In this paper, we explore the extent to which readability contributes to the perception of literary quality as defined by two categories of variables: expert-based (e.g., Pulitzer Prize, National Book Award) and crowd-based (e.g., GoodReads, WorldCat). Based on a large corpus of modern and contemporary fiction in English, we examine the correlation of a text’s readability with its perceived literary quality, also assessing readability measures against simpler stylometric features. Our results show that readability generally correlates with popularity as measured through open platforms such as GoodReads and WorldCat but has an inverse relation with three prestigious literary awards. This points to a distinction between crowd- and expert-based judgments of literary style, as well as to a discrimination between fame and appreciation in the reception of a book.

Modeling Readers’ Appreciation of Literary Narratives Through Sentiment Arcs and Semantic Profiles
Pascale Moreira | Yuri Bizzoni | Kristoffer Nielbo | Ida Marie Lassen | Mads Thomsen
Proceedings of the 5th Workshop on Narrative Understanding

Predicting literary quality and reader appreciation of narrative texts are highly complex challenges in quantitative and computational literary studies due to the fluid definitions of quality and the vast feature space that can be considered when modeling a literary work. This paper investigates the potential of sentiment arcs combined with topical-semantic profiling of literary narratives as indicators for their literary quality. Our experiments focus on a large corpus of 19th- and 20th-century English-language literary fiction, using GoodReads’ ratings as an imperfect approximation of the diverse range of reader evaluations and preferences. By leveraging a stacked ensemble of regression models, we achieve promising performance in predicting average readers’ scores, indicating the potential of our approach in modeling literary quality.