The world’s first South Sámi TTS – a revitalisation effort of an endangered language by reviving a legacy voice

Katri Hiovain-Asikainen, Thomas B. Kjærstad, Maja Lisa Kappfjell, Sjur N. Moshagen


Abstract
South Sámi (ISO 639: SMA) is a severely endangered language spoken by the South Sámi people in Norway and Sweden. Estimates of the number of speakers range from 500 to 600. Recent advances in speech technology, together with the growing popularity of spoken-language and audio content, have made it feasible to develop modern speech technology tools for minority languages as well, including the Sámi languages. This paper documents the development of the world’s first South Sámi text-to-speech (TTS) system, trained solely on digitized archive materials from 1989–1993. To reach a quality suitable for end users, we used a neural, end-to-end approach with a rule-based text processing module. The aim of our project is to contribute to language revitalization by offering tools that let language users use spoken language in new contexts. Since the modern written standard of South Sámi was established as late as 1978, the rise of speech technology might encourage language use even among people who are not accustomed to the written standard.
Anthology ID:
2025.iwclul-1.3
Volume:
Proceedings of the 10th International Workshop on Computational Linguistics for Uralic Languages
Month:
December
Year:
2025
Address:
Joensuu, Finland
Editors:
Mika Hämäläinen, Michael Rießler, Eiaki V. Morooka, Lev Kharlashkin
Venues:
IWCLUL | WS
SIG:
SIGUR
Publisher:
Association for Computational Linguistics
Pages:
12–21
URL:
https://aclanthology.org/2025.iwclul-1.3/
Cite (ACL):
Katri Hiovain-Asikainen, Thomas B. Kjærstad, Maja Lisa Kappfjell, and Sjur N. Moshagen. 2025. The world’s first South Sámi TTS – a revitalisation effort of an endangered language by reviving a legacy voice. In Proceedings of the 10th International Workshop on Computational Linguistics for Uralic Languages, pages 12–21, Joensuu, Finland. Association for Computational Linguistics.
Cite (Informal):
The world’s first South Sámi TTS – a revitalisation effort of an endangered language by reviving a legacy voice (Hiovain-Asikainen et al., IWCLUL 2025)
PDF:
https://aclanthology.org/2025.iwclul-1.3.pdf