Nils Hjortnaes


2024

Pronunciation of the phonemic inventory of a new language often presents difficulties to second language (L2) learners. These challenges can be alleviated by the development of pronunciation feedback tools that take speech input from learners and return information about errors in the utterance. This paper presents the development of a corpus designed for use in pronunciation feedback research. The corpus is comprised of gold standard recordings from isiZulu teachers and recordings from isiZulu L2 learners that have been annotated for pronunciation errors. Exploring the potential benefits of word-level versus phoneme-level feedback necessitates a speech corpus that has been annotated for errors on the phoneme-level. To aid in this discussion, this corpus of isiZulu L2 speech has been annotated for phoneme-errors in utterances, as well as suprasegmental errors in tone.

2021

2020

In this paper, we expand on previous work on automatic speech recognition in a low-resource scenario typical of data collected by field linguists. We train DeepSpeech models on 35 hours of dialectal Komi speech recordings and correct the output using language models constructed from various sources. Previous experiments showed that transfer learning using DeepSpeech can improve the accuracy of a speech recognizer for Komi, though the error rate remained very high. In this paper we present further experiments with language models created using KenLM from text materials available online. These are constructed from two corpora, one containing literary texts, one for social media content, and another combining the two. We then trained the model using each language model to explore the impact of the language model data source on the speech recognition model. Our results show significant improvements of over 25% in character error rate and nearly 20% in word error rate. This offers important methodological insight into how ASR results can be improved under low-resource conditions: transfer learning can be used to compensate the lack of training data in the target language, and online texts are a very useful resource when developing language models in this context.