Towards Optimal TTS Corpora

Didier Cadic, Cédric Boidin, Christophe d’Alessandro


Abstract
Unit selection text-to-speech systems currently produce very natural synthesized phrases by concatenating speech segments from a large database. Recently, increasing demand for high-quality voices built from less data has created a need to further optimize the textual corpus recorded by the speaker. This corpus is traditionally the result of a condensation process: sentences are selected from a reference corpus using an optimization algorithm (generally greedy) guided by the coverage rate of classic units (diphones, triphones, words, etc.). Such an approach is, however, strongly constrained by the finite content of the reference corpus, which offers limited linguistic possibilities. To gain flexibility in the optimization process, we introduce in this paper a new corpus-building procedure based on sentence construction rather than sentence selection. Sentences are generated using Finite State Transducers, assisted by a human operator and guided by a new frequency-weighted coverage criterion based on Vocalic Sandwiches. This semi-automatic process requires time-consuming human intervention but seems to give access to much denser corpora, with a density increase of 30 to 40% for a given coverage rate.
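As a rough illustration of the traditional condensation process mentioned in the abstract (greedy sentence selection guided by unit coverage), the sketch below picks sentences from a small reference corpus until a target frequency-weighted diphone coverage is reached. It is not the procedure proposed in the paper, which builds sentences with Finite State Transducers and a Vocalic Sandwich criterion. The names `diphones`, `greedy_select`, `unit_freq`, the toy phone sequences, and the choice to normalize the gain by sentence length are all assumptions made for this example.

```python
from collections import Counter


def diphones(phones):
    """Return the set of adjacent phone pairs in a phone sequence."""
    return {(a, b) for a, b in zip(phones, phones[1:])}


def greedy_select(sentences, unit_freq, target_coverage=0.9):
    """Greedy condensation over a reference corpus (illustrative only).

    sentences: dict mapping a sentence id to its list of phones.
    unit_freq: dict mapping a diphone to its frequency weight.
    Returns the chosen sentence ids and the weighted coverage reached.
    """
    total = sum(unit_freq.values())
    covered, chosen = set(), []
    remaining = dict(sentences)
    coverage = 0.0

    def gain(item):
        # Assumed score: newly covered frequency mass per phone,
        # so that short, dense sentences are preferred.
        _, phones = item
        new_units = diphones(phones) - covered
        return sum(unit_freq.get(u, 0) for u in new_units) / max(len(phones), 1)

    while remaining and total > 0 and coverage < target_coverage:
        best_id, best_phones = max(remaining.items(), key=gain)
        if gain((best_id, best_phones)) == 0:
            break  # no remaining sentence adds any new unit
        chosen.append(best_id)
        covered |= diphones(best_phones)
        del remaining[best_id]
        coverage = sum(unit_freq.get(u, 0) for u in covered) / total
    return chosen, coverage


# Toy reference corpus; the phone symbols are placeholders.
reference = {
    "s1": ["a", "b", "a"],
    "s2": ["b", "a"],
    "s3": ["c", "a", "b", "c"],
}
# Weight each diphone by how often it occurs in the reference corpus.
unit_freq = Counter(
    pair for phones in reference.values() for pair in zip(phones, phones[1:])
)
chosen, coverage = greedy_select(reference, unit_freq, target_coverage=1.0)
```

The limitation the abstract points to is visible here: the loop can only choose among sentences that already exist in `reference`, so the achievable density is bounded by the content of that corpus.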
Anthology ID:
L10-1415
Volume:
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)
Month:
May
Year:
2010
Address:
Valletta, Malta
Editors:
Nicoletta Calzolari, Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner, Daniel Tapias
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
URL:
http://www.lrec-conf.org/proceedings/lrec2010/pdf/608_Paper.pdf
Cite (ACL):
Didier Cadic, Cédric Boidin, and Christophe d’Alessandro. 2010. Towards Optimal TTS Corpora. In Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10), Valletta, Malta. European Language Resources Association (ELRA).
Cite (Informal):
Towards Optimal TTS Corpora (Cadic et al., LREC 2010)