A Free/Open-Source Morphological Analyser and Generator for Sakha

Sardana Ivanova, Jonathan Washington, Francis Tyers


Abstract
We present what is, to our knowledge, the first published morphological analyser and generator for Sakha, a marginalised language of Siberia. The transducer, developed using HFST, has coverage solidly above 90% and high precision. In developing the analyser, we have expanded linguistic knowledge about Sakha and developed strategies for handling complex grammatical patterns. The transducer is already being used in downstream tasks, including computer-assisted language learning applications for linguistic maintenance and computational linguistics shared tasks.
Anthology ID:
2022.lrec-1.550
Volume:
Proceedings of the Thirteenth Language Resources and Evaluation Conference
Month:
June
Year:
2022
Address:
Marseille, France
Editors:
Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
5137–5142
URL:
https://aclanthology.org/2022.lrec-1.550
Cite (ACL):
Sardana Ivanova, Jonathan Washington, and Francis Tyers. 2022. A Free/Open-Source Morphological Analyser and Generator for Sakha. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 5137–5142, Marseille, France. European Language Resources Association.
Cite (Informal):
A Free/Open-Source Morphological Analyser and Generator for Sakha (Ivanova et al., LREC 2022)
PDF:
https://aclanthology.org/2022.lrec-1.550.pdf
Code:
apertium/apertium-sah