Van Huy Nguyen


2016

The IOIT English ASR system for IWSLT 2016
Van Huy Nguyen | Trung-Nghia Phung | Tat Thang Vu | Chi Mai Luong
Proceedings of the 13th International Conference on Spoken Language Translation

This paper describes the speech recognition system of IOIT for IWSLT 2016. Four single DNN-based systems were developed to produce the first-pass lattices for the test sets using a baseline language model. The second-pass lattices were then obtained by applying N-best list rescoring with topic-adapted language models, which were constructed from closed topic sentences by applying a text selection method. The final transcriptions of the test sets were produced by combining the rescored results. On the 2013 evaluation set, we reduce the word error rate by 1.62% absolute. On the 2014 set, provided as a development set, the word error rate of our transcription is 11.3%.
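The N-best rescoring step mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Hypothesis` class, the `rescore_nbest` function, the `lm_weight` parameter, and the toy unigram model are all hypothetical names and values chosen for illustration, and a real system would query an n-gram or neural language model instead. The sketch only shows the general idea of re-ranking first-pass hypotheses by combining the acoustic score with a score from a different (here, topic-adapted) language model.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Hypothesis:
    """One entry of an N-best list (hypothetical structure)."""
    words: List[str]
    acoustic_score: float  # log-domain acoustic score from the first pass


def rescore_nbest(
    nbest: List[Hypothesis],
    adapted_lm_logprob: Callable[[List[str]], float],
    lm_weight: float = 1.0,
) -> Hypothesis:
    """Return the hypothesis with the best combined score.

    The baseline LM score is discarded and replaced by the score of the
    adapted LM, scaled by lm_weight (an assumed tuning parameter).
    """
    def total(h: Hypothesis) -> float:
        return h.acoustic_score + lm_weight * adapted_lm_logprob(h.words)

    return max(nbest, key=total)


# Toy stand-in for a topic-adapted LM: unigram log-probabilities with a
# fixed penalty for unseen words. The values are made up for illustration.
TOY_UNIGRAMS = {"the": -1.0, "talk": -2.0, "is": -1.0, "great": -2.0}
UNKNOWN_LOGPROB = -6.0


def toy_lm_logprob(words: List[str]) -> float:
    return sum(TOY_UNIGRAMS.get(w, UNKNOWN_LOGPROB) for w in words)


nbest = [
    Hypothesis(["the", "talk", "is", "great"], acoustic_score=-100.0),
    Hypothesis(["the", "talk", "is", "grate"], acoustic_score=-99.5),
]

best = rescore_nbest(nbest, toy_lm_logprob, lm_weight=1.0)
print(" ".join(best.words))
```

With `lm_weight=0.0` the ranking falls back to the acoustic score alone, so the second hypothesis would win; a positive weight lets the language model correct the acoustically preferred but unlikely word.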

2015

The IOIT English ASR system for IWSLT 2015
Van Huy Nguyen | Quoc Bao Nguyen | Tat Thang Vu | Chi Mai Luong
Proceedings of the 12th International Workshop on Spoken Language Translation: Evaluation Campaign

2013

The 2013 KIT IWSLT speech-to-text systems for German and English
Kevin Kilgour | Christian Mohr | Michael Heck | Quoc Bao Nguyen | Van Huy Nguyen | Evgeniy Shin | Igor Tseyzer | Jonas Gehring | Markus Müller | Matthias Sperber | Sebastian Stüker | Alex Waibel
Proceedings of the 10th International Workshop on Spoken Language Translation: Evaluation Campaign

This paper describes our English Speech-to-Text (STT) systems for the 2013 IWSLT TED ASR track. The systems consist of multiple subsystems that are combinations of different front-ends, e.g. MVDR-MFCC based and lMel based ones, GMM and NN acoustic models and different phone sets. The outputs of the subsystems are combined via confusion network combination. Decoding is done in two stages, where the systems of the second stage are adapted in an unsupervised manner on the combination of the first stage outputs using VTLN, MLLR, and cMLLR.