Classifier identification in Ancient Egyptian as a low-resource sequence-labelling task

Dmitry Nikolaev, Jorke Grotenhuis, Haleli Harel, Orly Goldwasser


Abstract
The complex Ancient Egyptian (AE) writing system was characterised by widespread use of graphemic classifiers (determinatives): silent (unpronounced) hieroglyphic signs clarifying the meaning or indicating the pronunciation of the host word. The study of classifiers has intensified in recent years with the launch and rapid growth of the iClassifier project, a web-based platform for annotation and analysis of classifiers in ancient and modern languages. Thanks to the data contributed by the project participants, it is now possible to formulate the identification of classifiers in AE texts as an NLP task. In this paper, we take first steps towards solving this task by implementing a series of sequence-labelling neural models, which achieve promising performance despite the modest amount of training data. We discuss tokenisation and operationalisation issues that arise when working with AE texts and contrast our approach with frequency-based baselines.
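To make the task framing concrete, the following is a minimal sketch of what a frequency-based baseline for classifier identification could look like when the problem is cast as per-sign binary labelling. It is an illustration only, not the baseline used in the paper: the abstract does not specify the baselines, and the function names, the tie-breaking rule, the default label for unseen signs, and the toy data (sign codes used as opaque strings with invented labels) are all assumptions.

```python
from collections import Counter, defaultdict


def train_frequency_baseline(sequences):
    """Learn a majority label per sign.

    sequences: iterable of (signs, labels) pairs, where labels use
    1 = classifier and 0 = not a classifier (hypothetical encoding).
    """
    counts = defaultdict(Counter)
    for signs, labels in sequences:
        for sign, label in zip(signs, labels):
            counts[sign][label] += 1
    # Majority vote per sign; ties fall back to 0 (assumed convention).
    return {sign: int(c[1] > c[0]) for sign, c in counts.items()}


def predict(model, signs, default=0):
    """Label each sign with its majority label; unseen signs get `default`."""
    return [model.get(sign, default) for sign in signs]


# Toy, made-up training data: each word is a list of sign codes
# paired with invented classifier labels.
train = [
    (["G17", "A1"], [0, 1]),
    (["D21", "A1"], [0, 1]),
    (["A1", "Z1"], [0, 0]),
]

model = train_frequency_baseline(train)
print(predict(model, ["D21", "A1", "N5"]))  # prints [0, 1, 0]
```

A baseline of this kind ignores context entirely: a sign receives the same label wherever it occurs. A sequence-labelling model, by contrast, can condition each decision on the neighbouring signs in the word.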
Anthology ID:
2024.ml4al-1.5
Volume:
Proceedings of the 1st Workshop on Machine Learning for Ancient Languages (ML4AL 2024)
Month:
August
Year:
2024
Address:
Hybrid in Bangkok, Thailand and online
Editors:
John Pavlopoulos, Thea Sommerschield, Yannis Assael, Shai Gordin, Kyunghyun Cho, Marco Passarotti, Rachele Sprugnoli, Yudong Liu, Bin Li, Adam Anderson
Venues:
ML4AL | WS
Publisher:
Association for Computational Linguistics
Pages:
42–47
URL:
https://aclanthology.org/2024.ml4al-1.5
Cite (ACL):
Dmitry Nikolaev, Jorke Grotenhuis, Haleli Harel, and Orly Goldwasser. 2024. Classifier identification in Ancient Egyptian as a low-resource sequence-labelling task. In Proceedings of the 1st Workshop on Machine Learning for Ancient Languages (ML4AL 2024), pages 42–47, Hybrid in Bangkok, Thailand and online. Association for Computational Linguistics.
Cite (Informal):
Classifier identification in Ancient Egyptian as a low-resource sequence-labelling task (Nikolaev et al., ML4AL-WS 2024)
PDF:
https://aclanthology.org/2024.ml4al-1.5.pdf