Evaluating Computational Representations of Character: An Austen Character Similarity Benchmark

Funing Yang; Carolyn Jane Anderson

doi:10.18653/v1/2024.nlp4dh-1.3

Abstract

Several systems have been developed to extract information about characters to aid computational analysis of English literature. We propose character similarity grouping as a holistic evaluation task for these pipelines. We present AustenAlike, a benchmark suite of character similarities in Jane Austen’s novels. Our benchmark draws on three notions of character similarity: a structurally defined notion of similarity; a socially defined notion of similarity; and an expert defined set extracted from literary criticism. We use AustenAlike to evaluate character features extracted using two pipelines, BookNLP and FanfictionNLP. We build character representations from four kinds of features and compare them to the three AustenAlike benchmarks and to GPT-4 similarity rankings. We find that though computational representations capture some broad similarities based on shared social and narrative roles, the expert pairings in our third benchmark are challenging for all systems, highlighting the subtler aspects of similarity noted by human readers.

Anthology ID: 2024.nlp4dh-1.3
Volume: Proceedings of the 4th International Conference on Natural Language Processing for Digital Humanities
Month: November
Year: 2024
Address: Miami, USA
Editors: Mika Hämäläinen, Emily Öhman, So Miyagawa, Khalid Alnajjar, Yuri Bizzoni
Venues: NLP4DH | WS
Publisher: Association for Computational Linguistics
Pages: 17–30
URL: https://aclanthology.org/2024.nlp4dh-1.3/
DOI: 10.18653/v1/2024.nlp4dh-1.3
Cite (ACL): Funing Yang and Carolyn Jane Anderson. 2024. Evaluating Computational Representations of Character: An Austen Character Similarity Benchmark. In Proceedings of the 4th International Conference on Natural Language Processing for Digital Humanities, pages 17–30, Miami, USA. Association for Computational Linguistics.
Cite (Informal): Evaluating Computational Representations of Character: An Austen Character Similarity Benchmark (Yang & Anderson, NLP4DH 2024)
PDF: https://aclanthology.org/2024.nlp4dh-1.3.pdf
