F. Brugnara


2012

FBK@IWSLT 2012 – ASR track
D. Falavigna | R. Gretter | F. Brugnara | D. Giuliani
Proceedings of the 9th International Workshop on Spoken Language Translation: Evaluation Campaign

This paper reports on the participation of FBK in the IWSLT 2012 evaluation campaign on automatic speech recognition, namely in the English ASR track. Both primary and contrastive submissions were sent for evaluation. The ASR system features acoustic models trained on a portion of the TED talk recordings that was automatically selected according to the fidelity of the provided transcriptions. Three decoding steps are performed, interleaved with acoustic feature normalization and acoustic model adaptation. A final rescoring step, based on an interpolated language model, is applied to word graphs generated in the third decoding step. For the primary submission, the language models entering the interpolation are trained on both out-of-domain and in-domain text data, whereas the contrastive submission uses both "general purpose" and auxiliary language models trained only on out-of-domain text data. Despite this difference, similar performance is obtained with the two submissions.

2011

FBK@IWSLT 2011
N. Ruiz | A. Bisazza | F. Brugnara | D. Falavigna | D. Giuliani | S. Jaber | R. Gretter | M. Federico
Proceedings of the 8th International Workshop on Spoken Language Translation: Evaluation Campaign

This paper reports on the participation of FBK in the IWSLT 2011 Evaluation, namely in the English ASR track, the Arabic-English MT track, and the English-French MT and SLT tracks. Our ASR system features acoustic models trained on a portion of the TED talk recordings that was automatically selected according to the fidelity of the provided transcriptions. Three decoding steps are performed, interleaved with acoustic feature normalization and acoustic model adaptation. Concerning the MT and SLT systems, besides language-specific pre-processing and the automatic introduction of punctuation in the ASR output, two major improvements are reported over our last year's baselines. First, we applied a fill-up method for phrase-table adaptation; second, we explored the use of hybrid class-based language models to better capture the language style of public speeches.