A Whisper-Based System with Multi-Faceted Data Augmentation for Low-Resource Language

Pin-Cheng Chen; Yu-Chi Chen; Chia-Chun Liang; Cheng-Yu Lin; Ping-Juei Tsai; Wei-Yun Ma

Abstract

This paper presents a comprehensive approach for the Formosa Speech Recognition Challenge 2025 (FSR-2025), targeting automatic speech recognition (ASR) for the under-resourced Dapu and Zhao’an dialects of Taiwanese Hakka. Our method integrates data augmentation and robustness techniques, including SpecAugment, dialect-aware special tokens, text-to-speech (TTS) augmentation, noise/reverberation mixing, and speed perturbation, to mitigate data scarcity and domain mismatch. Experiments on the official FSR-2025 datasets show consistent improvements in both character error rate (CER) and word error rate (WER). Extensive ablation studies further confirm that each component contributes positively. These results offer a practical path toward robust ASR for under-resourced Hakka dialects and suggest broader applicability to other low-resource languages.
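The two metrics named in the abstract, CER and WER, are both edit-distance ratios: the number of substitutions, insertions, and deletions needed to turn the hypothesis into the reference, divided by the reference length. The following is a minimal sketch of that standard definition, not the challenge's official scorer. The function names are illustrative, and the choices to drop whitespace for CER and to split on whitespace for WER are assumptions; the text normalization used in FSR-2025 scoring may differ.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences.

    Counts the minimum number of substitutions, insertions, and
    deletions needed to turn `hyp` into `ref`.
    """
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def error_rate(ref_units, hyp_units):
    """Edit distance normalized by the reference length."""
    if not ref_units:
        raise ValueError("reference must not be empty")
    return edit_distance(ref_units, hyp_units) / len(ref_units)


def cer(reference, hypothesis):
    """Character error rate; whitespace is ignored (an assumption)."""
    ref_chars = [c for c in reference if not c.isspace()]
    hyp_chars = [c for c in hypothesis if not c.isspace()]
    return error_rate(ref_chars, hyp_chars)


def wer(reference, hypothesis):
    """Word error rate; words are whitespace-separated (an assumption)."""
    return error_rate(reference.split(), hypothesis.split())
```

Because the denominator is the reference length, either rate can exceed 1.0 when the hypothesis contains many insertions.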

Anthology ID: 2025.rocling-main.59
Volume: Proceedings of the 37th Conference on Computational Linguistics and Speech Processing (ROCLING 2025)
Month: November
Year: 2025
Address: National Taiwan University, Taipei City, Taiwan
Editors: Kai-Wei Chang, Ke-Han Lu, Chih-Kai Yang, Zhi-Rui Tam, Wen-Yu Chang, Chung-Che Wang
Venue: ROCLING
Publisher: Association for Computational Linguistics
Pages: 489–498
URL: https://aclanthology.org/2025.rocling-main.59/
Cite (ACL): Pin-Cheng Chen, Yu-Chi Chen, Chia-Chun Liang, Cheng-Yu Lin, Ping-Juei Tsai, and Wei-Yun Ma. 2025. A Whisper-Based System with Multi-Faceted Data Augmentation for Low-Resource Language. In Proceedings of the 37th Conference on Computational Linguistics and Speech Processing (ROCLING 2025), pages 489–498, National Taiwan University, Taipei City, Taiwan. Association for Computational Linguistics.
Cite (Informal): A Whisper-Based System with Multi-Faceted Data Augmentation for Low-Resource Language (Chen et al., ROCLING 2025)
PDF: https://aclanthology.org/2025.rocling-main.59.pdf
