Transfer Learning for Less-Resourced Semitic Languages Speech Recognition: the Case of Amharic

Yonas Woldemariam


Abstract
Building automatic speech recognition (ASR) systems requires a large amount of speech and text data, and the problem is more acute for less-resourced languages. In this paper, we investigate a model adaptation method, namely transfer learning, for a less-resourced Semitic language, Amharic, to address resource scarcity in speech recognition development and improve the Amharic ASR model. In our experiments, we transfer acoustic models trained on two different source languages (English and Mandarin) to Amharic using very limited resources. The experimental results show that a significant WER (Word Error Rate) reduction is achieved by transferring the hidden layers of the neural networks trained on the source languages. In the best case, the Amharic ASR model adapted from English reduces WER from 38.72% to 24.50% (an improvement of 14.22% absolute). Adapting the Mandarin model improves the baseline Amharic model with a WER reduction of 10.25% (absolute). Our analysis also reveals that the speech recognition performance of the adapted acoustic model is influenced more by the relatedness (in a relative sense) between the source and the target languages than by the other factors considered (e.g., the quality of the source models). Furthermore, other Semitic as well as Afro-Asiatic languages could benefit from the methodology presented in this study.
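The abstract reports its results as WER and as absolute WER reductions. As a reading aid, the sketch below shows the standard definition of WER (word-level edit distance divided by the number of reference words) and the difference between an absolute and a relative reduction. It is a minimal illustration, not the paper's scoring pipeline: the function names `word_error_rate` and `wer_reduction` are hypothetical, and the relative figure is derived here from the two WER values quoted in the abstract rather than reported by the author.

```python
from __future__ import annotations


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER: word-level edit distance / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,           # deletion
                            curr[j - 1] + 1,       # insertion
                            prev[j - 1] + cost))   # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def wer_reduction(baseline: float, adapted: float) -> tuple[float, float]:
    """Return (absolute reduction in points, relative reduction in percent)."""
    absolute = baseline - adapted
    relative = 100.0 * absolute / baseline
    return absolute, relative


# WER values quoted in the abstract (baseline vs. English-adapted model).
abs_red, rel_red = wer_reduction(38.72, 24.50)
print(f"{abs_red:.2f} points absolute, {rel_red:.1f}% relative")
```

With the abstract's figures, the 14.22-point absolute reduction corresponds to roughly a 36.7% relative reduction over the baseline.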
Anthology ID:
2020.sltu-1.9
Volume:
Proceedings of the 1st Joint Workshop on Spoken Language Technologies for Under-resourced languages (SLTU) and Collaboration and Computing for Under-Resourced Languages (CCURL)
Month:
May
Year:
2020
Address:
Marseille, France
Editors:
Dorothee Beermann, Laurent Besacier, Sakriani Sakti, Claudia Soria
Venue:
SLTU
Publisher:
European Language Resources Association
Pages:
61–69
Language:
English
URL:
https://aclanthology.org/2020.sltu-1.9
Cite (ACL):
Yonas Woldemariam. 2020. Transfer Learning for Less-Resourced Semitic Languages Speech Recognition: the Case of Amharic. In Proceedings of the 1st Joint Workshop on Spoken Language Technologies for Under-resourced languages (SLTU) and Collaboration and Computing for Under-Resourced Languages (CCURL), pages 61–69, Marseille, France. European Language Resources Association.
Cite (Informal):
Transfer Learning for Less-Resourced Semitic Languages Speech Recognition: the Case of Amharic (Woldemariam, SLTU 2020)
PDF:
https://aclanthology.org/2020.sltu-1.9.pdf