Assessing Large Language Models in Translating Coptic and Ancient Greek Ostraca

Audric-Charles Wannaz, So Miyagawa


Abstract
The advent of Large Language Models (LLMs) has substantially raised the quality and lowered the cost of Machine Translation (MT). Can scholars working with ancient languages benefit from this new technology? More specifically, can current MT facilitate multilingual digital papyrology? To answer this question, we evaluate 9 LLMs on the task of translating 4 Coptic and 4 Ancient Greek ostraca into English, using 6 NLP metrics. We argue that some models have already reached a level of performance sufficient to assist human experts. As can be expected from the difference in training corpus size, all models seem to perform better with Ancient Greek than with Coptic, where hallucinations are markedly more common. For the Coptic texts, the specialised Coptic Translator (CT) competes closely with Claude 3 Opus for the rank of most promising tool, while Claude 3 Opus and GPT-4o compete for the same position for the Ancient Greek texts. We argue that MT now substantially heightens the incentive to work on multilingual corpora. This could have a positive and long-lasting effect on Classics and Egyptology and help reduce the historical bias in translation availability. In closing, we reflect upon the need to meet AI-generated translations with an adequate critical stance.
Anthology ID:
2024.nlp4dh-1.44
Volume:
Proceedings of the 4th International Conference on Natural Language Processing for Digital Humanities
Month:
November
Year:
2024
Address:
Miami, USA
Editors:
Mika Hämäläinen, Emily Öhman, So Miyagawa, Khalid Alnajjar, Yuri Bizzoni
Venue:
NLP4DH
Publisher:
Association for Computational Linguistics
Pages:
463–471
URL:
https://aclanthology.org/2024.nlp4dh-1.44
Cite (ACL):
Audric-Charles Wannaz and So Miyagawa. 2024. Assessing Large Language Models in Translating Coptic and Ancient Greek Ostraca. In Proceedings of the 4th International Conference on Natural Language Processing for Digital Humanities, pages 463–471, Miami, USA. Association for Computational Linguistics.
Cite (Informal):
Assessing Large Language Models in Translating Coptic and Ancient Greek Ostraca (Wannaz & Miyagawa, NLP4DH 2024)
PDF:
https://aclanthology.org/2024.nlp4dh-1.44.pdf