Markus Müller


2018

pdf bib
KIT Lecture Translator: Multilingual Speech Translation with One-Shot Learning
Florian Dessloch | Thanh-Le Ha | Markus Müller | Jan Niehues | Thai-Son Nguyen | Ngoc-Quan Pham | Elizabeth Salesky | Matthias Sperber | Sebastian Stüker | Thomas Zenkel | Alexander Waibel
Proceedings of the 27th International Conference on Computational Linguistics: System Demonstrations

In today’s globalized world we have the ability to communicate with people across the world. However, in many situations the language barrier still presents a major issue. For example, many foreign students coming to KIT to study are initially unable to follow a lecture in German. Therefore, we offer an automatic simultaneous interpretation service for students. To fulfill this task, we have developed a low-latency translation system that is adapted to lectures and covers several language pairs. While the switch from traditional Statistical Machine Translation to Neural Machine Translation (NMT) significantly improved performance, to integrate NMT into the speech translation framework required several adjustments. We have addressed the run-time constraints and different types of input. Furthermore, we utilized one-shot learning to easily add new topic-specific terms to the system. Besides better performance, NMT also enabled us increase our covered languages through multilingual NMT. % Combining these techniques, we are able to provide an adapted speech translation system for several European languages.

pdf bib
BULBasaa: A Bilingual Basaa-French Speech Corpus for the Evaluation of Language Documentation Tools
Fatima Hamlaoui | Emmanuel-Moselly Makasso | Markus Müller | Jonas Engelmann | Gilles Adda | Alex Waibel | Sebastian Stüker
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

2016

pdf bib
Evaluation of the KIT Lecture Translation System
Markus Müller | Sarah Fünfer | Sebastian Stüker | Alex Waibel
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

To attract foreign students is among the goals of the Karlsruhe Institute of Technology (KIT). One obstacle to achieving this goal is that lectures at KIT are usually held in German which many foreign students are not sufficiently proficient in, as, e.g., opposed to English. While the students from abroad are learning German during their stay at KIT, it is challenging to become proficient enough in it in order to follow a lecture. As a solution to this problem we offer our automatic simultaneous lecture translation. It translates German lectures into English in real time. While not as good as human interpreters, the system is available at a price that KIT can afford in order to offer it in potentially all lectures. In order to assess whether the quality of the system we have conducted a user study. In this paper we present this study, the way it was conducted and its results. The results indicate that the quality of the system has passed a threshold as to be able to support students in their studies. The study has helped to identify the most crucial weaknesses of the systems and has guided us which steps to take next.

pdf bib
Lecture Translator - Speech translation framework for simultaneous lecture translation
Markus Müller | Thai Son Nguyen | Jan Niehues | Eunah Cho | Bastian Krüger | Thanh-Le Ha | Kevin Kilgour | Matthias Sperber | Mohammed Mediani | Sebastian Stüker | Alex Waibel
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations

2015

pdf bib
Evaluation of Crowdsourced User Input Data for Spoken Dialog Systems
Maria Schmidt | Markus Müller | Martin Wagner | Sebastian Stüker | Alex Waibel | Hansjörg Hofmann | Steffen Werner
Proceedings of the 16th Annual Meeting of the Special Interest Group on Discourse and Dialogue

2014

pdf bib
The 2014 KIT IWSLT speech-to-text systems for English, German and Italian
Kevin Kilgour | Michael Heck | Markus Müller | Matthias Sperber | Sebastian Stüker | Alex Waibel
Proceedings of the 11th International Workshop on Spoken Language Translation: Evaluation Campaign

This paper describes our German, Italian and English Speech-to-Text (STT) systems for the 2014 IWSLT TED ASR track. Our setup uses ROVER and confusion network combination from various subsystems to achieve a good overall performance. The individual subsystems are built by using different front-ends, (e.g., MVDR-MFCC or lMel), acoustic models (GMM or modular DNN) and phone sets and by training on various subsets of the training data. Decoding is performed in two stages, where the GMM systems are adapted in an unsupervised manner on the combination of the first stage outputs using VTLN, MLLR, and cMLLR. The combination setup produces a final hypothesis that has a significantly lower WER than any of the individual subsystems.

pdf bib
Multilingual deep bottle neck features: a study on language selection and training techniques
Markus Müller | Sebastian Stüker | Zaid Sheikh | Florian Metze | Alex Waibel
Proceedings of the 11th International Workshop on Spoken Language Translation: Papers

Previous work has shown that training the neural networks for bottle neck feature extraction in a multilingual way can lead to improvements in word error rate and average term weighted value in a telephone key word search task. In this work we conduct a systematic study on a) which multilingual training strategy to employ, b) the effect of language selection and amount of multilingual training data used and c) how to find a suitable combination for languages. We conducted our experiment on the key word search task and the languages of the IARPA BABEL program. In a first step, we assessed the performance of a single language out of all available languages in combination with the target language. Based on these results, we then combined a multitude of languages. We also examined the influence of the amount of training data per language, as well as different techniques for combining the languages during network training. Our experiments show that data from arbitrary additional languages does not necessarily increase the performance of a system. But when combining a suitable set of languages, a significant gain in performance can be achieved.

2013

pdf bib
The 2013 KIT IWSLT speech-to-text systems for German and English
Kevin Kilgour | Christian Mohr | Michael Heck | Quoc Bao Nguyen | Van Huy Nguyen | Evgeniy Shin | Igor Tseyzer | Jonas Gehring | Markus Müller | Matthias Sperber | Sebastian Stüker | Alex Waibel
Proceedings of the 10th International Workshop on Spoken Language Translation: Evaluation Campaign

This paper describes our English Speech-to-Text (STT) systems for the 2013 IWSLT TED ASR track. The systems consist of multiple subsystems that are combinations of different front-ends, e.g. MVDR-MFCC based and lMel based ones, GMM and NN acoustic models and different phone sets. The outputs of the subsystems are combined via confusion network combination. Decoding is done in two stages, where the systems of the second stage are adapted in an unsupervised manner on the combination of the first stage outputs using VTLN, MLLR, and cMLLR.