Acquiring Pronunciation Data for a Placenames Lexicon in a Less-Resourced Language
Rhys James Jones
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)
A new procedure is described for generating pronunciations for a dictionary of placenames in a less-resourced language (Welsh, spoken in Wales, UK). The method suits situations where skilled phoneticians with expertise in the language are lacking, but where native speakers and a text-to-speech synthesiser for the language are available. Without skilled phoneticians, pronunciations cannot be edited directly, so a method has been devised that lets non-phonetician native speakers edit pronunciations without knowledge of the phonology of the language. The key advance is the use of re-spelling, by which the non-specialist native speaker indicates pronunciation in a linguistically-naïve fashion. The re-spelled forms of placenames drive a set of specially-adapted letter-to-sound rules, which generate the desired pronunciations. The speech synthesiser provides audio feedback to the native speaker editor for verification. A graphical user interface links the database, the speech synthesiser and the native speaker editor. This method has been used successfully to generate pronunciations for placenames in Wales.
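The re-spelling idea in this abstract can be illustrated with a minimal sketch. It is not the paper's rule set or phone inventory: the rule table `RULES`, the ASCII phone labels, and the entry fields `spelling` and `respelling` are all hypothetical, chosen only to show how an editor's re-spelled form might override the standard spelling before letter-to-sound rules are applied.

```python
# Hypothetical toy rule table: a few Welsh digraphs and letters mapped to
# made-up ASCII phone labels. The real rules are not given in the abstract.
RULES = {"ll": "L", "dd": "D", "ch": "X", "ff": "f", "f": "v"}


def letters_to_sounds(text):
    """Apply the toy rules left to right, trying two-letter matches first."""
    text = text.lower()
    phones = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if len(pair) == 2 and pair in RULES:
            phones.append(RULES[pair])
            i += 2
        else:
            # Letters without a rule are passed through unchanged.
            phones.append(RULES.get(text[i], text[i]))
            i += 1
    return phones


def pronounce(entry):
    """Use the editor's re-spelling when present, else the normal spelling."""
    source = entry.get("respelling") or entry["spelling"]
    return letters_to_sounds(source)


# With no re-spelling, the rules read "f" as "v"; an editor who wants "f"
# can re-spell the form with "ff" without knowing any phonology.
default = pronounce({"spelling": "afon", "respelling": None})
edited = pronounce({"spelling": "afon", "respelling": "affon"})
```

In a workflow like the one described, the resulting phone sequence would then be passed to the synthesiser so the editor can listen and confirm or re-spell again.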
Tools and resources for speech synthesis arising from a Welsh TTS project
Rhys James Jones
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)
The WISPR project ("Welsh and Irish Speech Processing Resources") has been building text-to-speech synthesis systems for Welsh and for Irish, as well as building links between the developers and potential users of the software. The Welsh half of the project has encountered various challenges, in the areas of the tokenisation of input text, the formatting of letter-to-sound rules, and the implementation of the "greedy algorithm" for text selection. The solutions to these challenges have resulted in various tools which may be of use to other developers using Festival for TTS for other languages. These resources are made freely available.
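The abstract names a "greedy algorithm" for text selection without giving details. One common reading is a greedy coverage procedure: repeatedly pick the sentence that adds the most not-yet-covered units to the recording script. The sketch below follows that reading as an assumption; the function names are hypothetical, and adjacent letter pairs stand in for the phonetic units (such as diphones) a real system would count.

```python
def units(sentence):
    """Stand-in for phonetic units: adjacent letter pairs within each word."""
    found = set()
    for word in sentence.lower().split():
        letters = [c for c in word if c.isalpha()]
        found.update(a + b for a, b in zip(letters, letters[1:]))
    return found


def greedy_select(sentences, max_sentences):
    """Pick sentences one at a time, each time maximising new unit coverage."""
    covered = set()
    chosen = []
    remaining = list(sentences)
    while remaining and len(chosen) < max_sentences:
        # The best candidate is the one contributing the most unseen units.
        best = max(remaining, key=lambda s: len(units(s) - covered))
        gain = units(best) - covered
        if not gain:
            break  # nothing left would add coverage
        chosen.append(best)
        covered |= gain
        remaining.remove(best)
    return chosen, covered


chosen, covered = greedy_select(["ab", "abc", "cd", "abcd"], max_sentences=2)
```

In this toy run the single sentence "abcd" already covers every unit, so selection stops early even though the budget allows two sentences.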
Book Review: Progress in Speech Synthesis
Computational Linguistics, Volume 24, Number 3, September 1998
Analysis of Unknown Words through Morphological Decomposition
Alan W. Black
Joke van de Plassche
Fifth Conference of the European Chapter of the Association for Computational Linguistics