The NICT ASR system for IWSLT 2013

Chien-Lin Huang, Paul R. Dixon, Shigeki Matsuda, Youzheng Wu, Xugang Lu, Masahiro Saiko, Chiori Hori


Abstract
This study presents the NICT automatic speech recognition (ASR) system submitted to the IWSLT 2013 ASR evaluation. The system applies two types of acoustic features and three types of acoustic models, and comprises six subsystems with different acoustic features and models. We report the results of the individual subsystems and of their fusion, and highlight the improvements gained from our proposed methods: automatic segmentation of audio data, language model adaptation, speaker adaptive training of deep neural network models, and the NICT SprinTra decoder. Experimental results indicate that the proposed methods improve performance on lecture speech recognition tasks. The system achieved a 13.5% word error rate on the IWSLT 2013 ASR English test data set.
Anthology ID:
2013.iwslt-evaluation.6
Volume:
Proceedings of the 10th International Workshop on Spoken Language Translation: Evaluation Campaign
Month:
December 5-6
Year:
2013
Address:
Heidelberg, Germany
Editor:
Joy Ying Zhang
Venue:
IWSLT
SIG:
SIGSLT
URL:
https://aclanthology.org/2013.iwslt-evaluation.6
Cite (ACL):
Chien-Lin Huang, Paul R. Dixon, Shigeki Matsuda, Youzheng Wu, Xugang Lu, Masahiro Saiko, and Chiori Hori. 2013. The NICT ASR system for IWSLT 2013. In Proceedings of the 10th International Workshop on Spoken Language Translation: Evaluation Campaign, Heidelberg, Germany.
Cite (Informal):
The NICT ASR system for IWSLT 2013 (Huang et al., IWSLT 2013)
PDF:
https://aclanthology.org/2013.iwslt-evaluation.6.pdf