Evan Hansen
2025
Kildin Saami-Russian-(English) Parallel Corpus Building
Proceedings of the 10th International Workshop on Computational Linguistics for Uralic Languages
This paper presents two parallel corpora of written Kildin Saami and describes how they were compiled. The first, a dictionary corpus, contains 101,889 Kildin Saami tokens of example phrases and sentences drawn from three Russian-Kildin Saami dictionaries and the glossary of the nonfiction book Saami ornaments; each example is accompanied by its headword and by translations into up to four other languages. Where possible, headwords are paired with their underived bases, which makes the corpus a suitable resource for investigating morphological derivation in Kildin Saami. The second corpus comprises 23,884 Kildin Saami tokens and was compiled from Saami ornaments, a trilingual (Russian-Kildin Saami-English) book introducing various Saami handicrafts and their creators from across Russian Sápmi.