Developing Speech Synthesis for Under-Resourced Languages by “Faking it”: An Experiment with Somali

Harold Somers, Gareth Evans, Zeinab Mohamed


Abstract
Speech synthesis or text-to-speech (TTS) systems are currently available for a number of the world's major languages, but for thousands of other, unsupported, languages no such technology is available. While awaiting the development of such technology, we propose using an existing TTS system for a major language (the base language, BL) to "fake" TTS for an unsupported language (the target language, TL). This paper describes the factors which determine the choice of a suitable BL for a given TL, and describe an experiment with a fake Somali TTS system evaluated in the real-life situation of a doctor–patient dialogue. 28 Somali participants were asked to judge the comprehensibility of 25 short Somali sentences recorded with a German TTS system. Results suggest that "faking it" provides reasonable stop-gap TTS for unsupported languages.
Anthology ID:
L06-1289
Volume:
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)
Month:
May
Year:
2006
Address:
Genoa, Italy
Editors:
Nicoletta Calzolari, Khalid Choukri, Aldo Gangemi, Bente Maegaard, Joseph Mariani, Jan Odijk, Daniel Tapias
Venue:
LREC
SIG:
Publisher:
European Language Resources Association (ELRA)
Note:
Pages:
Language:
URL:
http://www.lrec-conf.org/proceedings/lrec2006/pdf/483_pdf.pdf
DOI:
Bibkey:
Cite (ACL):
Harold Somers, Gareth Evans, and Zeinab Mohamed. 2006. Developing Speech Synthesis for Under-Resourced Languages by “Faking it”: An Experiment with Somali. In Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06), Genoa, Italy. European Language Resources Association (ELRA).
Cite (Informal):
Developing Speech Synthesis for Under-Resourced Languages by “Faking it”: An Experiment with Somali (Somers et al., LREC 2006)
Copy Citation:
PDF:
http://www.lrec-conf.org/proceedings/lrec2006/pdf/483_pdf.pdf