Preeti Rao


2024

pdf bib
STORiCo: Storytelling TTS for Hindi with Character Voice Modulation
Pavan Tankala | Preethi Jyothi | Preeti Rao | Pushpak Bhattacharyya
Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 2: Short Papers)

We present a new Hindi text-to-speech (TTS) dataset and demonstrate its utility for the expressive synthesis of children’s audio stories. The dataset comprises narration by a single female speaker who modifies her voice to produce different story characters. Annotation for dialogue identification, character labelling, and character attribution are provided, all of which are expected to facilitate the learning of character voice and speaking styles. Experiments are conducted using different versions of the annotated dataset that enable training a multi-speaker TTS model on the single-speaker data. Subjective tests show that the multi-speaker model improves expressiveness and character voice consistency compared to the baseline single-speaker TTS. With the multi-speaker model, objective evaluations show comparable word error rates, better speaker voice consistency, and higher correlations with ground-truth emotion attributes. We release a new 16.8 hours storytelling speech dataset in Hindi and propose effective solutions for expressive TTS with narrator voice modulation and character voice consistency.

2012

pdf bib
Automatic pronunciation assessment for language learners with acoustic-phonetic features
Vaishali Patil | Preeti Rao
Proceedings of the Workshop on Speech and Language Processing Tools in Education