Min Han Teng


2025

Whisper Finetuning For Hakka Recognition in Low Resource
Min Han Teng | Ci Dao Chen | You Ting Lin | Bing Jhih Huang
Proceedings of the 37th Conference on Computational Linguistics and Speech Processing (ROCLING 2025)

We study automatic speech recognition (ASR) for Hakka, a low-resource language with substantial dialectal variation. Focusing on Zhaoan and Dapu, we fine-tune Whisper using Low-Rank Adaptation (LoRA) and apply data augmentation to mitigate data scarcity. Experiments show that LoRA combined with augmentation substantially improves cross-dialect recognition while maintaining parameter efficiency. Our results demonstrate the potential of lightweight adaptation to extend large-scale ASR systems to underrepresented languages, supporting the preservation of Hakka speech and orthography.
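To illustrate why LoRA is considered parameter-efficient, the following is a minimal sketch of the general technique, not the authors' implementation. It assumes the standard LoRA formulation, in which a frozen weight matrix W is augmented with a trainable low-rank product scaled by alpha / r. The function names, matrix sizes, and values are hypothetical and chosen only for illustration. They are not taken from the paper or from Whisper's actual configuration.

```python
# Minimal pure-Python sketch of a LoRA-adapted linear layer (no bias).
# All names and dimensions here are illustrative assumptions.

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def lora_forward(W, A, B, x, alpha):
    """Compute y = W x + (alpha / r) * B (A x).

    W: frozen weights, shape (d_out, d_in)
    A: trainable down-projection, shape (r, d_in)
    B: trainable up-projection, shape (d_out, r)
    """
    rank = len(A)
    scale = alpha / rank
    base = matvec(W, x)               # frozen pretrained path
    update = matvec(B, matvec(A, x))  # low-rank trainable path
    return [b + scale * u for b, u in zip(base, update)]

def lora_param_counts(d_out, d_in, rank):
    """Return (full, lora) trainable parameter counts for one matrix."""
    full = d_out * d_in
    lora = rank * (d_in + d_out)
    return full, lora

# Hypothetical sizes: a 1024 x 1024 projection adapted with rank 8.
full, lora = lora_param_counts(1024, 1024, 8)
```

Under these assumed sizes, the adapter trains 16,384 parameters for a matrix that holds 1,048,576, which is the kind of saving that makes fine-tuning feasible when data and compute are limited.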