Kildin Saami-Russian-(English) Parallel Corpus Building

Evan Hansen


Abstract
This paper presents two parallel corpora of written Kildin Saami and describes how they were compiled. The first, a dictionary corpus, contains 101,889 Kildin Saami tokens of example phrases and sentences drawn from three Russian-Kildin Saami dictionaries and the glossary of the nonfiction book Saami ornaments, each accompanied by its headword and translations into up to four other languages. Where possible, headwords are paired with their underived bases, making the corpus a suitable resource for investigating morphological derivation in Kildin Saami. The second corpus comprises 23,884 Kildin Saami tokens and was compiled from Saami ornaments, a trilingual (Russian-Kildin Saami-English) book introducing various Saami handicrafts and their creators from across Russian Sápmi.
Anthology ID:
2025.iwclul-1.7
Volume:
Proceedings of the 10th International Workshop on Computational Linguistics for Uralic Languages
Month:
December
Year:
2025
Address:
Joensuu, Finland
Editors:
Mika Hämäläinen, Michael Rießler, Eiaki V. Morooka, Lev Kharlashkin
Venues:
IWCLUL | WS
SIG:
SIGUR
Publisher:
Association for Computational Linguistics
Pages:
49–56
URL:
https://aclanthology.org/2025.iwclul-1.7/
Cite (ACL):
Evan Hansen. 2025. Kildin Saami-Russian-(English) Parallel Corpus Building. In Proceedings of the 10th International Workshop on Computational Linguistics for Uralic Languages, pages 49–56, Joensuu, Finland. Association for Computational Linguistics.
Cite (Informal):
Kildin Saami-Russian-(English) Parallel Corpus Building (Hansen, IWCLUL 2025)
PDF:
https://aclanthology.org/2025.iwclul-1.7.pdf