Enzo Gamboni
2025
Fine-Tuning Whisper for Kildin Sami
Proceedings of the 10th International Workshop on Computational Linguistics for Uralic Languages
For this study, Whisper, an automatic speech recognition (ASR) system, was fine-tuned on Kildin Sami, an endangered and low-resource Uralic language, using an ASR-tailored dataset of under 30 minutes. Three Whisper models were trained with this dataset, each with a different base language (English, Finnish, or Russian), to examine which provided the best result. Results were measured using Word Error Rate (WER); fine-tuning the Russian-base Whisper model resulted in the lowest WER, at 68.55%. While still high, this result is impressive given the small amount of language-specific training data, and the training process yielded insights relevant to further work.
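
To make the evaluation metric concrete, the following is a minimal sketch of how Word Error Rate is commonly computed: the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and a model hypothesis, divided by the number of reference words. It is not the evaluation code used in the study. The function name `word_error_rate` is hypothetical, and plain whitespace tokenization with no text normalization is an assumption made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: words are separated by whitespace and compared exactly
    (no lowercasing, punctuation stripping, or other normalization).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)
```

Under this definition, a WER of 68.55% means that the number of word-level edits needed to turn the model output into the reference is about 0.69 times the number of reference words. Because insertions count as errors, the value can exceed 100%.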