OkwuGbé: End-to-End Speech Recognition for Fon and Igbo

Bonaventure F. P. Dossou, Chris Chinenye Emezue


Abstract
Language is a fundamental component of human communication. African low-resourced languages have recently been a major subject of research in machine translation and other text-based areas of NLP. However, there is still very little comparable research in speech recognition for African languages. OkwuGbé is a step towards building speech recognition systems for African low-resourced languages. Using Fon and Igbo as our case study, we build two end-to-end deep neural network-based speech recognition models. We present a state-of-the-art automatic speech recognition (ASR) model for Fon, and a benchmark ASR model result for Igbo. Our findings serve both as a guide for future NLP research on Fon and Igbo in particular and as a guide for the creation of speech recognition models for other African low-resourced languages in general. The source code for the Fon and Igbo models has been made publicly available. Moreover, Okwugbe, a Python library, has been created to simplify the process of building and training ASR models.
Anthology ID:
2021.winlp-1.1
Volume:
Proceedings of the Fifth Workshop on Widening Natural Language Processing
Month:
November
Year:
2021
Address:
Punta Cana, Dominican Republic
Editors:
Erika Varis, Ryan Georgi, Alicia Tsai, Antonios Anastasopoulos, Kyathi Chandu, Xanda Schofield, Surangika Ranathunga, Haley Lepp, Tirthankar Ghosal
Venue:
WiNLP
Publisher:
Association for Computational Linguistics
Pages:
1–4
URL:
https://aclanthology.org/2021.winlp-1.1
Cite (ACL):
Bonaventure F. P. Dossou and Chris Chinenye Emezue. 2021. OkwuGbé: End-to-End Speech Recognition for Fon and Igbo. In Proceedings of the Fifth Workshop on Widening Natural Language Processing, pages 1–4, Punta Cana, Dominican Republic. Association for Computational Linguistics.
Cite (Informal):
OkwuGbé: End-to-End Speech Recognition for Fon and Igbo (Dossou & Emezue, WiNLP 2021)