Fully Convolutional ASR for Less-Resourced Endangered Languages

Bao Thai, Robert Jimerson, Raymond Ptucha, Emily Prud’hommeaux


Abstract
The application of deep learning to automatic speech recognition (ASR) has yielded dramatic accuracy increases for languages with abundant training data, but languages with limited training resources have yet to see accuracy improvements on this scale. In this paper, we compare a fully convolutional approach for acoustic modeling in ASR with a variety of established acoustic modeling approaches. We evaluate our method on Seneca, a low-resource endangered language spoken in North America. Our method yields word error rates up to 40% lower than those reported using both standard GMM-HMM approaches and established deep neural methods, with a substantial reduction in training time. These results show particular promise for languages like Seneca that are both endangered and lack extensive documentation.
Anthology ID:
2020.sltu-1.17
Volume:
Proceedings of the 1st Joint Workshop on Spoken Language Technologies for Under-resourced languages (SLTU) and Collaboration and Computing for Under-Resourced Languages (CCURL)
Month:
May
Year:
2020
Address:
Marseille, France
Editors:
Dorothee Beermann, Laurent Besacier, Sakriani Sakti, Claudia Soria
Venue:
SLTU
Publisher:
European Language Resources Association
Pages:
126–130
Language:
English
URL:
https://aclanthology.org/2020.sltu-1.17
Cite (ACL):
Bao Thai, Robert Jimerson, Raymond Ptucha, and Emily Prud’hommeaux. 2020. Fully Convolutional ASR for Less-Resourced Endangered Languages. In Proceedings of the 1st Joint Workshop on Spoken Language Technologies for Under-resourced languages (SLTU) and Collaboration and Computing for Under-Resourced Languages (CCURL), pages 126–130, Marseille, France. European Language Resources Association.
Cite (Informal):
Fully Convolutional ASR for Less-Resourced Endangered Languages (Thai et al., SLTU 2020)
PDF:
https://aclanthology.org/2020.sltu-1.17.pdf