Acquiring Pronunciation Data for a Placenames Lexicon in a Less-Resourced Language

Briony Williams, Rhys James Jones


Abstract
A new procedure is described for generating pronunciations for a dictionary of placenames in a less-resourced language (Welsh, spoken in Wales, UK). The method is suitable where skilled phoneticians with expertise in the language are lacking, but native speakers and a text-to-speech synthesiser for the language are available. Without skilled phoneticians, pronunciations cannot be edited directly, so a method has been devised that lets non-phonetician native speakers edit pronunciations without knowledge of the phonology of the language. The key advance is the use of “re-spelling”, by which the non-specialist native speaker indicates pronunciation in a linguistically naïve fashion. The “re-spelled” forms of placenames drive a set of specially adapted letter-to-sound rules, which generate the desired pronunciations. The speech synthesiser provides audio feedback to the native-speaker editor for verification. A graphical user interface links the database, the speech synthesiser and the native-speaker editor. This method has been used successfully to generate pronunciations for placenames in Wales.
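The re-spelling idea can be illustrated with a minimal sketch. It is not the rule set or software from the paper: the two-entry rule table, the function names `letter_to_sound` and `pronounce`, and the "Flint" example are all assumptions made for illustration. The sketch applies longest-match letter-to-sound rules to a spelling, and lets an editor's re-spelling replace the original spelling as input to those same rules.

```python
# Toy letter-to-sound table (hypothetical, for illustration only).
# It encodes two facts of Welsh orthography: "ff" is /f/, single "f" is /v/.
RULES = {
    "ff": "f",
    "f": "v",
}


def letter_to_sound(spelling, rules=RULES):
    """Convert a spelling to a list of phone symbols by longest-match rules."""
    text = spelling.lower()
    # Try longer graphemes first so that "ff" wins over "f".
    keys = sorted(rules, key=len, reverse=True)
    phones = []
    i = 0
    while i < len(text):
        for key in keys:
            if text.startswith(key, i):
                phones.append(rules[key])
                i += len(key)
                break
        else:
            # No rule matched: pass the letter through unchanged (toy fallback).
            phones.append(text[i])
            i += 1
    return phones


def pronounce(name, respellings):
    """Use the editor's re-spelling if one exists, else the original spelling."""
    source = respellings.get(name, name)
    return letter_to_sound(source)


# Hypothetical editor session: the default rules read "Flint" with /v/,
# so the native speaker re-spells it with "ff" to obtain /f/.
default_phones = pronounce("Flint", {})
edited_phones = pronounce("Flint", {"Flint": "Fflint"})
```

In the workflow the abstract describes, the resulting pronunciation would then be passed to the speech synthesiser so that the editor can verify it by ear; that step is omitted here.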
Anthology ID:
L08-1477
Volume:
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)
Month:
May
Year:
2008
Address:
Marrakech, Morocco
Editors:
Nicoletta Calzolari, Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Daniel Tapias
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
URL:
http://www.lrec-conf.org/proceedings/lrec2008/pdf/55_paper.pdf
Cite (ACL):
Briony Williams and Rhys James Jones. 2008. Acquiring Pronunciation Data for a Placenames Lexicon in a Less-Resourced Language. In Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08), Marrakech, Morocco. European Language Resources Association (ELRA).
Cite (Informal):
Acquiring Pronunciation Data for a Placenames Lexicon in a Less-Resourced Language (Williams & Jones, LREC 2008)