Lukáš Žilka

Also published as: Lukas Zilka


2014

Free English and Czech telephone speech corpus shared under the CC-BY-SA 3.0 license
Matěj Korvas | Ondřej Plátek | Ondřej Dušek | Lukáš Žilka | Filip Jurčíček
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

We present a dataset of telephone conversations in English and Czech, developed for training acoustic models for automatic speech recognition (ASR) in spoken dialogue systems (SDSs). The data comprise 45 hours of speech in English and over 18 hours in Czech. A large part of the data, both audio and transcriptions, was collected using crowdsourcing; the rest are transcriptions by hired transcribers. We release the data together with scripts for data pre-processing and building acoustic models using the HTK and Kaldi ASR toolkits. We also publish the trained models described in this paper. The data are released under the CC-BY-SA 3.0 license; the scripts are licensed under Apache 2.0. In the paper, we report on the methodology of collecting the data, on the size and properties of the data, and on the scripts and their use. We verify the usability of the datasets by training and evaluating acoustic models using the presented data and scripts.

Alex: Bootstrapping a Spoken Dialogue System for a New Domain by Real Users
Ondřej Dušek | Ondřej Plátek | Lukáš Žilka | Filip Jurčíček
Proceedings of the 15th Annual Meeting of the Special Interest Group on Discourse and Dialogue (SIGDIAL)

2013

Comparison of Bayesian Discriminative and Generative Models for Dialogue State Tracking
Lukáš Žilka | David Marek | Matěj Korvas | Filip Jurčíček
Proceedings of the SIGDIAL 2013 Conference

2011

Using Explicit Semantic Analysis for Cross-Lingual Link Discovery
Petr Knoth | Lukas Zilka | Zdenek Zdrahal
Proceedings of the Fifth International Workshop On Cross Lingual Information Access