Li-Yu Huang


2022

Development of Mandarin-English code-switching speech synthesis system
Hsin-Jou Lien | Li-Yu Huang | Chia-Ping Chen
Proceedings of the 34th Conference on Computational Linguistics and Speech Processing (ROCLING 2022)

This paper proposes a Mandarin-English code-switching speech synthesis system. To focus on learning the content information between the two languages, the training data is a multilingual artificial dataset with a unified speaker style. Adding a language embedding to the system helps it adapt to the multilingual dataset. In addition, text preprocessing is applied in a language-dependent way. For Mandarin, the preprocessing consists of word segmentation and text-to-pinyin conversion, which not only improve fluency but also reduce learning complexity. Number normalization decides whether the Arabic numerals in a sentence need digit units added. For English, the preprocessing is acronym conversion, which determines how an acronym is pronounced.