LLM-based Machine Translation and Summarization for Latin

Martin Volk, Dominic Philipp Fischer, Lukas Fischer, Patricia Scheurer, Phillip Benjamin Ströbel


Abstract
This paper presents an evaluation of machine translation for Latin. We tested multilingual Large Language Models, in particular GPT-4, on letters from the 16th century that are in Latin and Early New High German. Our experiments include translation and cross-language summarization for the two historical languages into modern English and German. We show that LLM-based translation for Latin is clearly superior to previous approaches. We also show that LLM-based paraphrasing of Latin paragraphs from the historical letters produces English and German summaries that are close to human summaries published in the edition.
Anthology ID:
2024.lt4hala-1.15
Volume:
Proceedings of the Third Workshop on Language Technologies for Historical and Ancient Languages (LT4HALA) @ LREC-COLING-2024
Month:
May
Year:
2024
Address:
Torino, Italia
Editors:
Rachele Sprugnoli, Marco Passarotti
Venues:
LT4HALA | WS
Publisher:
ELRA and ICCL
Pages:
122–128
URL:
https://aclanthology.org/2024.lt4hala-1.15
Cite (ACL):
Martin Volk, Dominic Philipp Fischer, Lukas Fischer, Patricia Scheurer, and Phillip Benjamin Ströbel. 2024. LLM-based Machine Translation and Summarization for Latin. In Proceedings of the Third Workshop on Language Technologies for Historical and Ancient Languages (LT4HALA) @ LREC-COLING-2024, pages 122–128, Torino, Italia. ELRA and ICCL.
Cite (Informal):
LLM-based Machine Translation and Summarization for Latin (Volk et al., LT4HALA-WS 2024)
PDF:
https://aclanthology.org/2024.lt4hala-1.15.pdf