AccentFold: A Journey through African Accents for Zero-Shot ASR Adaptation to Target Accents

Abraham Owodunni, Aditya Yadavalli, Chris Emezue, Tobi Olatunji, Clinton Mbataku


Abstract
Despite advancements in speech recognition, accented speech remains challenging. While previous approaches have focused on modeling techniques or creating accented speech datasets, gathering sufficient data for the multitude of accents, particularly in the African context, remains impractical due to their sheer diversity and associated budget constraints. To address these challenges, we propose AccentFold, a method that exploits spatial relationships between learned accent embeddings to improve downstream Automatic Speech Recognition (ASR). Our exploratory analysis of speech embeddings representing 100+ African accents reveals interesting spatial accent relationships that highlight geographic and genealogical similarities and capture consistent phonological and morphological regularities, all learned empirically from speech. Furthermore, we discover accent relationships previously uncharacterized by the Ethnologue. Through empirical evaluation, we demonstrate the effectiveness of AccentFold by showing that, for out-of-distribution (OOD) accents, sampling accent subsets for training based on AccentFold information outperforms strong baselines, with a relative WER improvement of 4.6%. AccentFold presents a promising approach for improving ASR performance on accented speech, particularly in the context of African accents, where data scarcity and budget constraints pose significant challenges. Our findings emphasize the potential of leveraging linguistic relationships to improve zero-shot ASR adaptation to target accents.
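The idea of sampling training accents according to their position in embedding space can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the distance metric, the embedding model, or the subset size, so cosine distance between per-accent centroid vectors, the helper name `nearest_accents`, and the toy 2-D centroids below are all assumptions made purely for illustration.

```python
import numpy as np

def nearest_accents(centroids, target, k=2):
    """Return the k accent names whose centroids are closest to the target's.

    Assumption: closeness is measured by cosine distance between
    per-accent centroid embeddings (hypothetical choice, not from the paper).
    """
    names = [name for name in centroids if name != target]
    t = centroids[target] / np.linalg.norm(centroids[target])
    dists = []
    for name in names:
        v = centroids[name] / np.linalg.norm(centroids[name])
        # Cosine distance = 1 - cosine similarity of the unit vectors.
        dists.append(1.0 - float(t @ v))
    order = np.argsort(dists)[:k]
    return [names[i] for i in order]

# Toy centroids with placeholder accent names, invented for illustration only.
toy = {
    "accent_a": np.array([1.0, 0.0]),
    "accent_b": np.array([0.9, 0.1]),
    "accent_c": np.array([0.0, 1.0]),
    "accent_d": np.array([0.7, 0.7]),
}

# Pick the two accents nearest to the (held-out) target accent.
selected = nearest_accents(toy, "accent_a", k=2)
print(selected)
```

In a zero-shot setting, the target accent's own speech would be excluded from training, and only data from the selected neighbouring accents would be used for fine-tuning.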
Anthology ID:
2024.findings-eacl.142
Volume:
Findings of the Association for Computational Linguistics: EACL 2024
Month:
March
Year:
2024
Address:
St. Julian’s, Malta
Editors:
Yvette Graham, Matthew Purver
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
2146–2161
URL:
https://aclanthology.org/2024.findings-eacl.142
Cite (ACL):
Abraham Owodunni, Aditya Yadavalli, Chris Emezue, Tobi Olatunji, and Clinton Mbataku. 2024. AccentFold: A Journey through African Accents for Zero-Shot ASR Adaptation to Target Accents. In Findings of the Association for Computational Linguistics: EACL 2024, pages 2146–2161, St. Julian’s, Malta. Association for Computational Linguistics.
Cite (Informal):
AccentFold: A Journey through African Accents for Zero-Shot ASR Adaptation to Target Accents (Owodunni et al., Findings 2024)
PDF:
https://aclanthology.org/2024.findings-eacl.142.pdf