Maria Konstantinidou
2024
Exploring intertextuality across the Homeric poems through language models
Maria Konstantinidou
|
John Pavlopoulos
|
Elton Barker
Proceedings of the 1st Workshop on Machine Learning for Ancient Languages (ML4AL 2024)
Past research has modelled statistically the language of the Homeric poems, assessing the degree of surprisal for each verse through diverse metrics and resulting to the HoLM resource. In this study we utilise the HoLM resource to explore cross poem affinity at the verse level, looking at Iliadic verses and passages that are less surprising to the Odyssean model than to the Iliadic one and vice-versa. Using the same tool, we investigate verses that evoke greater surprise when assessed by a local model trained solely on their source book, compared to a global model trained on the entire source poem. Investigating deeper on the distribution of such verses across the Homeric poems we employ machine learning text classification to further analyse quantitatively cross-poem affinity in selected books.
HoLM: Analyzing the Linguistic Unexpectedness in Homeric Poetry
John Pavlopoulos
|
Ryan Sandell
|
Maria Konstantinidou
|
Chiara Bozzone
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
The authorship of the Homeric poems has been a matter of debate for centuries. Computational approaches such as language modeling exist that can aid experts in making crucial headway. We observe, however, that such work has, thus far, only been carried out at the level of lengthier excerpts, but not individual verses, the level at which most suspected interpolations occur. We address this weakness by presenting a corpus of Homeric verses, each complemented with a score quantifying linguistic unexpectedness based on Perplexity. We assess the nature of these scores by exploring their correlation with named entities, the frequency of character n-grams, and (inverse) word frequency, revealing robust correlations with the latter two. This apparent bias can be partly overcome by simply dividing scores for unexpectedness by the maximum term frequency per verse.
2023
Dating Greek Papyri with Text Regression
John Pavlopoulos
|
Maria Konstantinidou
|
Isabelle Marthot-Santaniello
|
Holger Essler
|
Asimina Paparigopoulou
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Dating Greek papyri accurately is crucial not only to edit their texts but also to understand numerous other aspects of ancient writing, document and book production and circulation, as well as various other aspects of administration, everyday life and intellectual history of antiquity. Although a substantial number of Greek papyri documents bear a date or other conclusive data as to their chronological placement, an even larger number can only be dated tentatively or in approximation, due to the lack of decisive evidence. By creating a dataset of 389 transcriptions of documentary Greek papyri, we train 389 regression models and we predict a date for the papyri with an average MAE of 54 years and an MSE of 1.17, outperforming image classifiers and other baselines. Last, we release date estimations for 159 manuscripts, for which only the upper limit is known.
Search