The JHU/KyotoU Speech Translation System for IWSLT 2018

Hirofumi Inaguma, Xuan Zhang, Zhiqi Wang, Adithya Renduchintala, Shinji Watanabe, Kevin Duh


Abstract
This paper describes the Johns Hopkins University (JHU) and Kyoto University submissions to the Speech Translation evaluation campaign at IWSLT 2018. Our end-to-end speech translation systems are based on ESPnet and implement an attention-based encoder-decoder model. As a comparison, we also experiment with a pipeline system that uses independent neural network systems for both the speech transcription and text translation components. We find that a transfer learning approach that bootstraps the end-to-end speech translation system with the speech transcription system’s parameters is important for training on small datasets.
Anthology ID:
2018.iwslt-1.23
Volume:
Proceedings of the 15th International Conference on Spoken Language Translation
Month:
October 29-30
Year:
2018
Address:
Brussels
Venue:
IWSLT
SIG:
SIGSLT
Publisher:
International Conference on Spoken Language Translation
Pages:
153–159
URL:
https://aclanthology.org/2018.iwslt-1.23
Cite (ACL):
Hirofumi Inaguma, Xuan Zhang, Zhiqi Wang, Adithya Renduchintala, Shinji Watanabe, and Kevin Duh. 2018. The JHU/KyotoU Speech Translation System for IWSLT 2018. In Proceedings of the 15th International Conference on Spoken Language Translation, pages 153–159, Brussels. International Conference on Spoken Language Translation.
Cite (Informal):
The JHU/KyotoU Speech Translation System for IWSLT 2018 (Inaguma et al., IWSLT 2018)
PDF:
https://aclanthology.org/2018.iwslt-1.23.pdf