Experiments on Speech Synthesis for Teochew, Can Taiwanese Help ?

Pierre Magistry, Ilaine Wang, Ty Eng Lim


Abstract
This paper reports on our preliminary experiments in speech processing for Teochew, an under-resourced Sinitic language spoken both in China and in diaspora communities around the world. Following the recent uptick in interest in Teochew among heritage speakers in the diaspora, and in order to respond to the needs of this community, we develop a Teochew Text-to-Speech system. We describe experiments to build this system and to assess the possible contribution of available resources in Taiwanese Hokkien, the closest language with a significant body of resources. The results of these experiments are not as conclusive as we expected: the Taiwanese dataset did not help our model significantly. Considering our objectives, however, we find it encouraging that a large training dataset proved unnecessary for this specific task, since a promising model could still be obtained with only a small dataset of Teochew. We hope that this work inspires other communities of speakers of languages in a revitalization phase.
Anthology ID:
2024.lrec-main.598
Volume:
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Month:
May
Year:
2024
Address:
Torino, Italia
Editors:
Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
Venues:
LREC | COLING
Publisher:
ELRA and ICCL
Pages:
6849–6854
URL:
https://aclanthology.org/2024.lrec-main.598
Cite (ACL):
Pierre Magistry, Ilaine Wang, and Ty Eng Lim. 2024. Experiments on Speech Synthesis for Teochew, Can Taiwanese Help ? In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 6849–6854, Torino, Italia. ELRA and ICCL.
Cite (Informal):
Experiments on Speech Synthesis for Teochew, Can Taiwanese Help ? (Magistry et al., LREC-COLING 2024)
PDF:
https://aclanthology.org/2024.lrec-main.598.pdf