Morphological Analysis of Sahidic Coptic for Automatic Glossing

Daniel Smith, Mans Hulden


Abstract
We report on the implementation of a morphological analyzer for the Sahidic dialect of Coptic, a now-extinct Afro-Asiatic language. The system is developed in the finite-state paradigm. The main purpose of the project is to provide a method by which scholars and linguists can semi-automatically gloss extant texts written in Sahidic. Since a complete lexicon containing all attested forms in different manuscripts requires significant expertise in Coptic spanning almost 1,000 years, we have equipped the analyzer with a core lexicon and extended it with a “guesser” ability to capture out-of-vocabulary items in any inflection. We also suggest an ASCII transliteration for the language. A brief evaluation is provided.
Anthology ID:
L16-1411
Volume:
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
Month:
May
Year:
2016
Address:
Portorož, Slovenia
Editors:
Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Sara Goggi, Marko Grobelnik, Bente Maegaard, Joseph Mariani, Helene Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
2584–2588
URL:
https://aclanthology.org/L16-1411
Cite (ACL):
Daniel Smith and Mans Hulden. 2016. Morphological Analysis of Sahidic Coptic for Automatic Glossing. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16), pages 2584–2588, Portorož, Slovenia. European Language Resources Association (ELRA).
Cite (Informal):
Morphological Analysis of Sahidic Coptic for Automatic Glossing (Smith & Hulden, LREC 2016)
PDF:
https://aclanthology.org/L16-1411.pdf