Profiling of Intertextuality in Latin Literature Using Word Embeddings

Patrick J. Burns, James A. Brofos, Kyle Li, Pramit Chaudhuri, Joseph P. Dexter


Abstract
Identifying intertextual relationships between authors is of central importance to the study of literature. We report an empirical analysis of intertextuality in classical Latin literature using word embedding models. To enable quantitative evaluation of intertextual search methods, we curate a new dataset of 945 known parallels drawn from traditional scholarship on Latin epic poetry. We train an optimized word2vec model on a large corpus of lemmatized Latin, which achieves state-of-the-art performance for synonym detection and outperforms a widely used lexical method for intertextual search. We then demonstrate that training embeddings on very small corpora can capture salient aspects of literary style and apply this approach to replicate a previous intertextual study of the Roman historian Livy, which relied on hand-crafted stylometric features. Our results advance the development of core computational resources for a major premodern language and highlight a productive avenue for cross-disciplinary collaboration between the study of literature and NLP.
Anthology ID:
2021.naacl-main.389
Volume:
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Month:
June
Year:
2021
Address:
Online
Editors:
Kristina Toutanova, Anna Rumshisky, Luke Zettlemoyer, Dilek Hakkani-Tur, Iz Beltagy, Steven Bethard, Ryan Cotterell, Tanmoy Chakraborty, Yichao Zhou
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
4900–4907
URL:
https://aclanthology.org/2021.naacl-main.389
DOI:
10.18653/v1/2021.naacl-main.389
Cite (ACL):
Patrick J. Burns, James A. Brofos, Kyle Li, Pramit Chaudhuri, and Joseph P. Dexter. 2021. Profiling of Intertextuality in Latin Literature Using Word Embeddings. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 4900–4907, Online. Association for Computational Linguistics.
Cite (Informal):
Profiling of Intertextuality in Latin Literature Using Word Embeddings (Burns et al., NAACL 2021)
PDF:
https://aclanthology.org/2021.naacl-main.389.pdf
Video:
https://aclanthology.org/2021.naacl-main.389.mp4
Code:
quantitativecriticismlab/naacl-hlt-2021-latin-intertextuality