Lacuna Language Learning: Leveraging RNNs for Ranked Text Completion in Digitized Coptic Manuscripts

Lauren Levine; Cindy Li; Lydia Bremer-McCollum; Nicholas Wagner; Amir Zeldes

doi:10.18653/v1/2024.ml4al-1.8

Abstract

Ancient manuscripts are frequently damaged, containing gaps in the text known as lacunae. In this paper, we present a bidirectional RNN model that predicts the Coptic characters missing from manuscript lacunae. Our best model achieves 72% accuracy on single-character reconstruction, but accuracy falls to 37% when reconstructing lacunae of various lengths. While the model is not suitable for definitive manuscript reconstruction, we argue that it can help scholars rank the likelihood of textual reconstructions. As evidence, we use our RNN model to rank reconstructions in two early Coptic manuscripts. Our investigation shows that neural models can augment traditional methods of textual restoration, providing scholars with an additional tool to assess lacunae in Coptic manuscripts.

Anthology ID: 2024.ml4al-1.8
Volume: Proceedings of the 1st Workshop on Machine Learning for Ancient Languages (ML4AL 2024)
Month: August
Year: 2024
Address: Hybrid in Bangkok, Thailand and online
Editors: John Pavlopoulos, Thea Sommerschield, Yannis Assael, Shai Gordin, Kyunghyun Cho, Marco Passarotti, Rachele Sprugnoli, Yudong Liu, Bin Li, Adam Anderson
Venues: ML4AL | WS
Publisher: Association for Computational Linguistics
Pages: 61–70
URL: https://aclanthology.org/2024.ml4al-1.8/
DOI: 10.18653/v1/2024.ml4al-1.8
Cite (ACL): Lauren Levine, Cindy Li, Lydia Bremer-McCollum, Nicholas Wagner, and Amir Zeldes. 2024. Lacuna Language Learning: Leveraging RNNs for Ranked Text Completion in Digitized Coptic Manuscripts. In Proceedings of the 1st Workshop on Machine Learning for Ancient Languages (ML4AL 2024), pages 61–70, Hybrid in Bangkok, Thailand and online. Association for Computational Linguistics.
Cite (Informal): Lacuna Language Learning: Leveraging RNNs for Ranked Text Completion in Digitized Coptic Manuscripts (Levine et al., ML4AL 2024)
PDF: https://aclanthology.org/2024.ml4al-1.8.pdf
