Speech Personalization using Parameter Efficient Fine-Tuning for Nepali Speakers

Kiran Pantha, Rupak Raj Ghimire, Bal Krishna Bal


Abstract
The performance of Automatic Speech Recognition (ASR) systems has improved significantly, driven by advances in large-scale pre-trained models. However, adapting such models to low-resource languages such as Nepali is challenging because labeled data and computational resources are scarce. Adapting a model to the unique speech characteristics of an individual speaker is a further challenge, and personalization addresses it by fitting the model to a particular speaker. This work investigates parameter-efficient fine-tuning (PEFT) methods, namely Low-Rank Adaptation (LoRA) and Weight-Decomposed Low-Rank Adaptation (DoRA), to improve the performance of fine-tuned Whisper ASR models on Nepali ASR tasks through personalization. The experiments demonstrate that the PEFT methods obtain competitive results while significantly reducing the number of trainable parameters compared to full fine-tuning. Relative to FTBase, LoRA and DoRA improve word error rate (WER) by 34.93% and 36.79%, respectively, and character error rate (CER) by 49.50% and 50.03%, respectively. Furthermore, the results highlight a 99.74% reduction in the total number of trainable parameters.
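The two kinds of figures quoted in the abstract, a relative error-rate change and a reduction in trainable parameters, can be illustrated with a minimal sketch. It assumes the usual definition of relative improvement, (baseline - adapted) / baseline, which the abstract does not spell out, and it counts parameters for a single weight matrix only, whereas the 99.74% figure refers to the whole model. The function names and all numbers below are hypothetical and are not taken from the paper.

```python
def relative_improvement(baseline: float, adapted: float) -> float:
    """Relative reduction of an error rate (WER or CER), in percent.

    Assumed definition: 100 * (baseline - adapted) / baseline.
    """
    if baseline <= 0:
        raise ValueError("baseline error rate must be positive")
    return 100.0 * (baseline - adapted) / baseline


def lora_param_reduction(d_out: int, d_in: int, rank: int) -> float:
    """Percent fewer trainable parameters for one weight matrix when a full
    d_out x d_in update is replaced by low-rank factors B (d_out x rank)
    and A (rank x d_in), as in LoRA."""
    full = d_out * d_in
    low_rank = rank * (d_out + d_in)
    return 100.0 * (full - low_rank) / full


# Hypothetical values chosen only for illustration (not results from the paper).
print(relative_improvement(baseline=40.0, adapted=30.0))    # 25.0
print(lora_param_reduction(d_out=1024, d_in=1024, rank=8))  # 98.4375
```

DoRA additionally learns a per-column magnitude vector on top of the low-rank factors, so its trainable-parameter count is slightly higher than the LoRA count computed here.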
Anthology ID:
2025.ltedi-1.31
Volume:
Proceedings of the 5th Conference on Language, Data and Knowledge: Fifth Workshop on Language Technology for Equality, Diversity, Inclusion
Month:
September
Year:
2025
Address:
Naples, Italy
Editors:
Katerina Gkirtzou, Slavko Žitnik, Jorge Gracia, Dagmar Gromann, Maria Pia di Buono, Johanna Monti, Maxim Ionov
Venues:
LTEDI | WS
Publisher:
Unior Press
Pages:
190–199
URL:
https://aclanthology.org/2025.ltedi-1.31/
Cite (ACL):
Kiran Pantha, Rupak Raj Ghimire, and Bal Krishna Bal. 2025. Speech Personalization using Parameter Efficient Fine-Tuning for Nepali Speakers. In Proceedings of the 5th Conference on Language, Data and Knowledge: Fifth Workshop on Language Technology for Equality, Diversity, Inclusion, pages 190–199, Naples, Italy. Unior Press.
Cite (Informal):
Speech Personalization using Parameter Efficient Fine-Tuning for Nepali Speakers (Pantha et al., LTEDI 2025)
PDF:
https://aclanthology.org/2025.ltedi-1.31.pdf