Arabic Speech Recognition of zero-resourced Languages: A case of Shehri (Jibbali) Language

Norah A. Alrashoudi, Omar Said Alshahri, Hend Al-Khalifa


Abstract
Many under-resourced languages lack computational resources for automatic speech recognition (ASR) because of data scarcity, which makes developing accurate ASR models challenging. Shehri (Jibbali), spoken in Oman, lacks extensive annotated speech data. This paper aims to improve ASR for this under-resourced language. We collected a Shehri (Jibbali) speech corpus and applied transfer learning by fine-tuning pre-trained ASR models on this dataset. Specifically, we fine-tuned Wav2Vec2.0, HuBERT, and Whisper, using techniques such as parameter-efficient fine-tuning. Evaluation using word error rate (WER) and character error rate (CER) showed that the Whisper model, fine-tuned on the Shehri (Jibbali) dataset, significantly outperformed the other models, with the best result from Whisper-medium at 3.5% WER. This demonstrates the effectiveness of transfer learning for resource-constrained tasks, showing high zero-shot performance of pre-trained models.
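The two metrics named in the abstract, WER and CER, are both edit-distance ratios: the number of substitutions, deletions, and insertions needed to turn the system output into the reference, divided by the reference length in words or characters. The following is a minimal sketch of that standard definition, not the authors' evaluation code; the function names and the English example strings are illustrative assumptions, and the paper may apply text normalization that is not shown here.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (each edit costs 1)."""
    prev = list(range(len(hyp) + 1))  # distance from empty ref prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to empty hyp prefix
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edits divided by reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edits (spaces included)
    divided by reference character count."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


# Hypothetical example strings, not taken from the Shehri corpus.
print(wer("the cat sat on the mat", "the cat sit on mat"))
print(cer("kitten", "sitting"))
```

Under this definition a WER of 3.5% corresponds to roughly 3.5 word-level edits per 100 reference words. Both rates can exceed 1.0 when the hypothesis contains many insertions.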
Anthology ID: 2024.osact-1.10
Volume: Proceedings of the 6th Workshop on Open-Source Arabic Corpora and Processing Tools (OSACT) with Shared Tasks on Arabic LLMs Hallucination and Dialect to MSA Machine Translation @ LREC-COLING 2024
Month: May
Year: 2024
Address: Torino, Italia
Editors: Hend Al-Khalifa, Kareem Darwish, Hamdy Mubarak, Mona Ali, Tamer Elsayed
Venues: OSACT | WS
Publisher: ELRA and ICCL
Pages: 84–92
URL: https://aclanthology.org/2024.osact-1.10
Cite (ACL): Norah A. Alrashoudi, Omar Said Alshahri, and Hend Al-Khalifa. 2024. Arabic Speech Recognition of zero-resourced Languages: A case of Shehri (Jibbali) Language. In Proceedings of the 6th Workshop on Open-Source Arabic Corpora and Processing Tools (OSACT) with Shared Tasks on Arabic LLMs Hallucination and Dialect to MSA Machine Translation @ LREC-COLING 2024, pages 84–92, Torino, Italia. ELRA and ICCL.
Cite (Informal): Arabic Speech Recognition of zero-resourced Languages: A case of Shehri (Jibbali) Language (Alrashoudi et al., OSACT-WS 2024)
PDF: https://aclanthology.org/2024.osact-1.10.pdf