Fine-Tuning Whisper for Kildin Sami

Enzo Gamboni


Abstract
For this study, Whisper, an automatic speech recognition (ASR) system, was fine-tuned on Kildin Sami, an endangered and low-resource Uralic language, using an ASR-tailored dataset of less than 30 minutes. Three Whisper models were trained on this dataset, each with a different base language (English, Finnish, or Russian), to examine which provided the best result. Results were measured using Word Error Rate (WER); fine-tuning the Russian-base Whisper model yielded the lowest WER, 68.55%. While still high, this result is impressive given such a small amount of language-specific training data, and the training process yielded insights relevant to further work.
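
Since the abstract reports its results as Word Error Rate, a minimal sketch of how that metric is conventionally computed may help: the word-level edit distance (substitutions, deletions, and insertions) divided by the number of reference words. The function name `word_error_rate` and the whitespace tokenization are assumptions for illustration; the paper's own evaluation code and text normalization are not described here and may differ.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative only: tokenization is plain whitespace splitting, with no
    case folding or punctuation handling.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Placeholder English strings, not data from the paper.
    print(word_error_rate("one two three four", "one too three"))
```

Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference, which is worth keeping in mind when reading scores from very small test sets.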
Anthology ID:
2025.iwclul-1.13
Volume:
Proceedings of the 10th International Workshop on Computational Linguistics for Uralic Languages
Month:
December
Year:
2025
Address:
Joensuu, Finland
Editors:
Mika Hämäläinen, Michael Rießler, Eiaki V. Morooka, Lev Kharlashkin
Venues:
IWCLUL | WS
SIG:
SIGUR
Publisher:
Association for Computational Linguistics
Pages:
106–111
URL:
https://aclanthology.org/2025.iwclul-1.13/
Cite (ACL):
Enzo Gamboni. 2025. Fine-Tuning Whisper for Kildin Sami. In Proceedings of the 10th International Workshop on Computational Linguistics for Uralic Languages, pages 106–111, Joensuu, Finland. Association for Computational Linguistics.
Cite (Informal):
Fine-Tuning Whisper for Kildin Sami (Gamboni, IWCLUL 2025)
PDF:
https://aclanthology.org/2025.iwclul-1.13.pdf