Optimizing Whisper Parameters and Training Data Processing for Formosa Speech Recognition Challenge 2025 - Hakka ASR II

Jhen-Hao Lee; Sheng-Wei Kuo; An-Che Cheng; Bing-Hua Chen; Yi-An Liu

Abstract

This paper presents the development and experimental process of our system for the Formosa Speech Recognition Challenge 2025 (Hakka ASR). The system is built on the OpenAI Whisper model, and we improved its performance on the Sixian dialect of Hakka through dataset preprocessing and model fine-tuning. In the warm-up evaluation, the system achieved a Character Error Rate (CER) of 10.51% on the character recognition track and a Syllable Error Rate (SER) of 14.72% on the pinyin recognition track. In the final evaluation, it achieved a CER of 11.21% on the character recognition track and an SER of 15.08% on the pinyin recognition track.
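
For readers unfamiliar with the two metrics, the following is a minimal sketch of how CER and SER are commonly computed: the Levenshtein edit distance between reference and hypothesis, divided by the reference length. It is an illustration under stated assumptions, not the challenge's official scoring script. The function names, the whitespace handling, and the placeholder syllable tokens are all hypothetical.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance (substitutions, insertions, deletions) between token sequences."""
    prev = list(range(len(hyp) + 1))  # distance from empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def error_rate(ref: Sequence[str], hyp: Sequence[str]) -> float:
    """Edit distance normalized by reference length."""
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


def cer(ref_text: str, hyp_text: str) -> float:
    """Character error rate. Assumption: spaces are ignored and each character is one token."""
    return error_rate(list(ref_text.replace(" ", "")),
                      list(hyp_text.replace(" ", "")))


def ser(ref_pinyin: str, hyp_pinyin: str) -> float:
    """Syllable error rate. Assumption: syllables are separated by whitespace."""
    return error_rate(ref_pinyin.split(), hyp_pinyin.split())


if __name__ == "__main__":
    # Placeholder strings, not real challenge data.
    print(f"CER = {cer('abcd', 'abxd'):.2%}")
    print(f"SER = {ser('syl1 syl2 syl3', 'syl1 syl3'):.2%}")
```

Because the distance is normalized by the reference length only, a hypothesis with many insertions can yield an error rate above 100%.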

Anthology ID: 2025.rocling-main.56
Volume: Proceedings of the 37th Conference on Computational Linguistics and Speech Processing (ROCLING 2025)
Month: November
Year: 2025
Address: National Taiwan University, Taipei City, Taiwan
Editors: Kai-Wei Chang, Ke-Han Lu, Chih-Kai Yang, Zhi-Rui Tam, Wen-Yu Chang, Chung-Che Wang
Venue: ROCLING
Publisher: Association for Computational Linguistics
Pages: 471–475
URL: https://aclanthology.org/2025.rocling-main.56/
Cite (ACL): Jhen-Hao Lee, Sheng-Wei Kuo, An-Che Cheng, Bing-Hua Chen, and Yi-An Liu. 2025. Optimizing Whisper Parameters and Training Data Processing for Formosa Speech Recognition Challenge 2025 - Hakka ASR II. In Proceedings of the 37th Conference on Computational Linguistics and Speech Processing (ROCLING 2025), pages 471–475, National Taiwan University, Taipei City, Taiwan. Association for Computational Linguistics.
Cite (Informal): Optimizing Whisper Parameters and Training Data Processing for Formosa Speech Recognition Challenge 2025 - Hakka ASR II (Lee et al., ROCLING 2025)
PDF: https://aclanthology.org/2025.rocling-main.56.pdf
