Maximum entropy language modeling for Russian ASR
Evgeniy Shin | Sebastian Stüker | Kevin Kilgour | Christian Fügen | Alex Waibel
Proceedings of the 10th International Workshop on Spoken Language Translation: Papers
Russian is a challenging language for automatic speech recognition systems due to its rich morphology. This rich morphology stems from Russian’s highly inflectional nature and the frequent use of prefixes and suffixes. Also, Russian has a very free word order, changes in which are used to reflect connotations of the sentences. Dealing with these phenomena is rather difficult for traditional n-gram models. We therefore investigate in this paper the use of a maximum entropy language model for Russian whose features are specifically designed to deal with the inflections in Russian, as well as the loose word order. We combine this with a subword-based language model in order to alleviate the problem of large vocabulary sizes necessary for dealing with highly inflecting languages. Applying the maximum entropy language model during re-scoring improves the word error rate of our recognition system by 1.2% absolute, while the use of the subword-based language model reduces the vocabulary size from 120k to 40k and the OOV rate from 4.8% to 2.1%.
End-to-End Evaluation in Simultaneous Translation
Olivier Hamon | Christian Fügen | Djamel Mostefa | Victoria Arranz | Muntsin Kolss | Alex Waibel | Khalid Choukri
Proceedings of the 12th Conference of the European Chapter of the ACL (EACL 2009)
LingWear: A Mobile Tourist Information System
Christian Fügen | Martin Westphal | Mike Schneider | Tanja Schultz | Alex Waibel
Proceedings of the First International Conference on Human Language Technology Research
- Alex Waibel 3
- Evgeniy Shin 1
- Sebastian Stüker 1
- Kevin Kilgour 1
- Olivier Hamon 1