Fine-Tuning Whisper for Kildin Sami

Enzo Gamboni

Abstract

For this study, Whisper, an automatic speech recognition (ASR) system, was fine-tuned on Kildin Sami, an endangered and low-resource Uralic language, using an ASR-tailored dataset of less than 30 minutes. Three different Whisper models were trained with this dataset, each one with a different base language (English, Finnish, or Russian), to examine which model provided the best result. Results were measured using Word Error Rate (WER); fine-tuning the Russian-base Whisper model resulted in the lowest WER, at 68.55%. Although this error rate is still high, it is a notable result given the small amount of language-specific training data, and the training process yielded insights relevant to further work.
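
As a point of reference for the metric named in the abstract, the following is a minimal sketch of how Word Error Rate is commonly computed: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and a model hypothesis, divided by the number of reference words. The function name `word_error_rate`, the plain whitespace tokenization, and the placeholder strings are assumptions made for illustration; the abstract does not state which tool or text normalization the study used.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: words are separated by whitespace and compared exactly,
    with no case folding or punctuation stripping.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Placeholder tokens, not data from the study:
# one substitution and one deletion against four reference words.
example_wer = word_error_rate("a b c d", "a x c")
```

Because insertions count as errors, the value can exceed 1.0 (100%) when the hypothesis is much longer than the reference.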

Anthology ID: 2025.iwclul-1.13
Volume: Proceedings of the 10th International Workshop on Computational Linguistics for Uralic Languages
Month: December
Year: 2025
Address: Joensuu, Finland
Editors: Mika Hämäläinen, Michael Rießler, Eiaki V. Morooka, Lev Kharlashkin
Venues: IWCLUL | WS
SIG: SIGUR
Publisher: Association for Computational Linguistics
Pages: 106–111
URL: https://aclanthology.org/2025.iwclul-1.13/
Cite (ACL): Enzo Gamboni. 2025. Fine-Tuning Whisper for Kildin Sami. In Proceedings of the 10th International Workshop on Computational Linguistics for Uralic Languages, pages 106–111, Joensuu, Finland. Association for Computational Linguistics.
Cite (Informal): Fine-Tuning Whisper for Kildin Sami (Gamboni, IWCLUL 2025)
PDF: https://aclanthology.org/2025.iwclul-1.13.pdf