FST Morphology for the Endangered Skolt Sami Language

Jack Rueter, Mika Hämäläinen


Abstract
We present advances in the development of an FST-based morphological analyzer and generator for Skolt Sami. Like other minority Uralic languages, Skolt Sami exhibits rich morphology, yet little gold-standard material exists for it. This makes NLP approaches to its study difficult without a solid morphological analysis. The language is severely endangered, and the work presented in this paper forms part of a larger revitalization effort. Furthermore, we intersperse our description with facilitation and description practices that are not well documented in the infrastructure. Currently, the analyzer covers over 30,000 Skolt Sami words in 148 inflectional paradigms and over 12 derivational forms.
Anthology ID:
2020.sltu-1.35
Volume:
Proceedings of the 1st Joint Workshop on Spoken Language Technologies for Under-resourced languages (SLTU) and Collaboration and Computing for Under-Resourced Languages (CCURL)
Month:
May
Year:
2020
Address:
Marseille, France
Editors:
Dorothee Beermann, Laurent Besacier, Sakriani Sakti, Claudia Soria
Venue:
SLTU
Publisher:
European Language Resources Association
Pages:
250–257
Language:
English
URL:
https://aclanthology.org/2020.sltu-1.35
Cite (ACL):
Jack Rueter and Mika Hämäläinen. 2020. FST Morphology for the Endangered Skolt Sami Language. In Proceedings of the 1st Joint Workshop on Spoken Language Technologies for Under-resourced languages (SLTU) and Collaboration and Computing for Under-Resourced Languages (CCURL), pages 250–257, Marseille, France. European Language Resources Association.
Cite (Informal):
FST Morphology for the Endangered Skolt Sami Language (Rueter & Hämäläinen, SLTU 2020)
PDF:
https://aclanthology.org/2020.sltu-1.35.pdf
Code:
giellalt/lang-sms