DNN-Based Multilingual Automatic Speech Recognition for Wolaytta using Oromo Speech

Martha Yifiru Tachbelie, Solomon Teferra Abate, Tanja Schultz


Abstract
Automatic Speech Recognition (ASR) is useful for human-computer interaction in any human language. However, because it requires a large speech corpus, which is very expensive to collect, ASR has not been developed for most languages. Multilingual ASR (MLASR) has been proposed as a way to share existing speech corpora among related languages, so that an ASR system can be developed for languages that lack the required speech corpora. The literature shows that phonetic relatedness extends across language families. We have therefore conducted MLASR experiments with languages from two families: Oromo (Cushitic) as the source and Wolaytta (Omotic) as the target. Using an Oromo Deep Neural Network (DNN) based acoustic model together with a Wolaytta pronunciation dictionary and language model, we achieved a Word Error Rate (WER) of 48.34% for Wolaytta. Moreover, our experiments show that adding only 30 minutes of speech data from the target language (Wolaytta) to the whole training data (22.8 hours) of the source language (Oromo) results in a relative WER reduction of 32.77%. Our results show that an ASR system can be developed for a language that has a pronunciation dictionary and language model by using an existing speech corpus of another language, irrespective of language family.
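For readers unfamiliar with the two metrics quoted in the abstract, the following is a minimal sketch of how WER and relative WER reduction are conventionally computed. It is not the authors' evaluation code; the function names and the toy sentences are hypothetical. The last function assumes the 32.77% reduction is measured against the 48.34% baseline, which the abstract suggests but does not state explicitly.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # Dynamic-programming Levenshtein distance over words,
    # keeping only the previous row of the table.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline_wer: float, improved_wer: float) -> float:
    """Fraction of the baseline WER that was removed."""
    return (baseline_wer - improved_wer) / baseline_wer


def apply_relative_reduction(baseline_wer: float, reduction: float) -> float:
    """WER implied by a given relative reduction from a baseline."""
    return baseline_wer * (1.0 - reduction)


if __name__ == "__main__":
    # Toy example (hypothetical sentences, not from the paper's corpus).
    print(word_error_rate("a b c d", "a x c"))
    # ASSUMPTION: 48.34% is the baseline for the 32.77% relative reduction.
    print(round(apply_relative_reduction(48.34, 0.3277), 2))
```

Under that assumption, the implied WER after adding the 30 minutes of Wolaytta speech would be roughly 32.5%; the paper itself should be consulted for the reported figure.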
Anthology ID:
2020.sltu-1.37
Volume:
Proceedings of the 1st Joint Workshop on Spoken Language Technologies for Under-resourced languages (SLTU) and Collaboration and Computing for Under-Resourced Languages (CCURL)
Month:
May
Year:
2020
Address:
Marseille, France
Editors:
Dorothee Beermann, Laurent Besacier, Sakriani Sakti, Claudia Soria
Venue:
SLTU
Publisher:
European Language Resources Association
Pages:
265–270
Language:
English
URL:
https://aclanthology.org/2020.sltu-1.37
Cite (ACL):
Martha Yifiru Tachbelie, Solomon Teferra Abate, and Tanja Schultz. 2020. DNN-Based Multilingual Automatic Speech Recognition for Wolaytta using Oromo Speech. In Proceedings of the 1st Joint Workshop on Spoken Language Technologies for Under-resourced languages (SLTU) and Collaboration and Computing for Under-Resourced Languages (CCURL), pages 265–270, Marseille, France. European Language Resources Association.
Cite (Informal):
DNN-Based Multilingual Automatic Speech Recognition for Wolaytta using Oromo Speech (Tachbelie et al., SLTU 2020)
PDF:
https://aclanthology.org/2020.sltu-1.37.pdf