Sui Generis: Large Language Models for Authorship Attribution and Verification in Latin

Svetlana Gorovaia, Gleb Schmidt, Ivan P. Yamshchikov


Abstract
This paper evaluates the performance of Large Language Models (LLMs) in authorship attribution and authorship verification tasks for Latin texts of the Patristic Era. The study shows that LLMs can be robust in zero-shot authorship verification even on short texts without sophisticated feature engineering. Yet, the models can also be easily “misled” by semantics. The experiments also demonstrate that steering the model’s authorship analysis and decision-making is challenging, unlike what is reported in the studies dealing with high-resource modern languages. Although LLMs prove to be able to beat, under certain circumstances, the traditional baselines, obtaining a nuanced and truly explainable decision requires at best a lot of experimentation.
Anthology ID:
2024.nlp4dh-1.39
Volume:
Proceedings of the 4th International Conference on Natural Language Processing for Digital Humanities
Month:
November
Year:
2024
Address:
Miami, USA
Editors:
Mika Hämäläinen, Emily Öhman, So Miyagawa, Khalid Alnajjar, Yuri Bizzoni
Venue:
NLP4DH
Publisher:
Association for Computational Linguistics
Pages:
398–412
URL:
https://aclanthology.org/2024.nlp4dh-1.39
Cite (ACL):
Svetlana Gorovaia, Gleb Schmidt, and Ivan P. Yamshchikov. 2024. Sui Generis: Large Language Models for Authorship Attribution and Verification in Latin. In Proceedings of the 4th International Conference on Natural Language Processing for Digital Humanities, pages 398–412, Miami, USA. Association for Computational Linguistics.
Cite (Informal):
Sui Generis: Large Language Models for Authorship Attribution and Verification in Latin (Gorovaia et al., NLP4DH 2024)
PDF:
https://aclanthology.org/2024.nlp4dh-1.39.pdf