Jim Talley
2006
Bootstrapping New Language ASR Capabilities: Achieving Best Letter-to-Sound Performance under Resource Constraints
Jim Talley
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)
One of the most critical components in the process of building automatic speech recognition (ASR) capabilities for a new language is the lexicon, or pronouncing dictionary. For practical reasons, it is desirable to manually create only a minimal lexicon using available native-speaker phonetic expertise and then use the resulting seed lexicon for machine-learning-based induction of a high-quality letter-to-sound (L2S) model that generates pronunciations for the remaining words of the language. This paper examines the viability of this scenario, specifically investigating three possible strategies for selecting lexemes (words) for manual transcription: choosing the most frequent lexemes of the language, choosing lexemes randomly, and selecting lexemes via an information-theoretic diversity measure. The relative effectiveness of these three strategies is evaluated as a function of the number of lexemes to be transcribed to create a bootstrapping lexicon. Generally, the newly developed orthographic-diversity-based selection strategy outperforms the others in this scenario, where only a limited number of lexemes can be transcribed. The experiments also provide generally useful insight into the L2S accuracy that can be expected to be sacrificed as training set size decreases.
2000
The Establishment of Motorola’s Human Language Data Resource Center: Addressing the Criticality of Language Resources in the Industrial Setting
Jim Talley
Proceedings of the Second International Conference on Language Resources and Evaluation (LREC’00)