Enzo Gamboni


2025

Fine-Tuning Whisper for Kildin Sami
Enzo Gamboni
Proceedings of the 10th International Workshop on Computational Linguistics for Uralic Languages

For this study, Whisper, an automatic speech recognition system, was fine-tuned on Kildin Sami, an endangered and low-resource Uralic language, using a dataset of less than 30 minutes tailored to automatic speech recognition. Three Whisper models were trained on this dataset, each with a different base language (English, Finnish, or Russian), to examine which base yielded the best result. Results were measured using Word Error Rate; fine-tuning the Russian-base Whisper model produced the lowest Word Error Rate, 68.55%. While still high, this result is impressive given the small amount of language-specific training data, and the training process yielded insights relevant to further work.
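Because the abstract reports its results as Word Error Rate, a minimal sketch of how that metric is commonly computed may help in reading the 68.55% figure. This is not the author's evaluation code: the function name `word_error_rate` and the sample strings are hypothetical, and the sketch assumes whitespace tokenization with no text normalization. It counts the word-level substitutions, deletions, and insertions needed to turn the hypothesis into the reference, then divides by the number of reference words.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Hypothetical helper for illustration; assumes whitespace tokenization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # reference word missing from hypothesis
                curr[j - 1] + 1,                  # extra word in hypothesis
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: one substitution and one missing word against
# four reference words.
print(word_error_rate("a b c d", "a x c"))
```

Under this definition a score of 68.55% means roughly two word-level errors for every three reference words, and the value can exceed 100% when the hypothesis contains many extra words.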