Requirements and Motivations of Low-Resource Speech Synthesis for Language Revitalization

Aidan Pine, Dan Wells, Nathan Brinklow, Patrick Littell, Korin Richmond


Abstract
This paper describes the motivation and development of speech synthesis systems for the purposes of language revitalization. By building speech synthesis systems for three Indigenous languages spoken in Canada, Kanien’kéha, Gitksan & SENĆOŦEN, we re-evaluate the question of how much data is required to build low-resource speech synthesis systems featuring state-of-the-art neural models. For example, preliminary results with English data show that a FastSpeech2 model trained with 1 hour of training data can produce speech with comparable naturalness to a Tacotron2 model trained with 10 hours of data. Finally, we motivate future research in evaluation and classroom integration in the field of speech synthesis for language revitalization.
Anthology ID:
2022.acl-long.507
Volume:
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
May
Year:
2022
Address:
Dublin, Ireland
Editors:
Smaranda Muresan, Preslav Nakov, Aline Villavicencio
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
7346–7359
URL:
https://aclanthology.org/2022.acl-long.507
DOI:
10.18653/v1/2022.acl-long.507
Award:
 Best Special Theme Paper
Cite (ACL):
Aidan Pine, Dan Wells, Nathan Brinklow, Patrick Littell, and Korin Richmond. 2022. Requirements and Motivations of Low-Resource Speech Synthesis for Language Revitalization. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 7346–7359, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
Requirements and Motivations of Low-Resource Speech Synthesis for Language Revitalization (Pine et al., ACL 2022)
PDF:
https://aclanthology.org/2022.acl-long.507.pdf
Code
 roedoejet/fastspeech2