On Knowledge Distillation for Translating Erroneous Speech Transcriptions

Ryo Fukuda, Katsuhito Sudoh, Satoshi Nakamura


Abstract
Recent studies argue that knowledge distillation is promising for speech translation (ST) using end-to-end models. In this work, we investigate the effect of knowledge distillation on a cascade ST built from automatic speech recognition (ASR) and machine translation (MT) models. We distill knowledge from a teacher model trained on human transcripts to a student model trained on erroneous ASR transcriptions. Our experimental results demonstrate that knowledge distillation is beneficial for cascade ST. Further investigation combining knowledge distillation with fine-tuning revealed that the combination consistently improved translation quality for two language pairs: English-Italian and Spanish-English.
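The setup described above can be read as a standard word-level knowledge distillation objective: the student MT model, whose source side is an erroneous ASR transcript, is trained to match the output distribution of a frozen teacher MT model, whose source side is the clean human transcript. The following is a minimal sketch under that assumption, not the paper's exact method; the names kd_loss, teacher_model, student_model, alpha, and the tensor variables are hypothetical.

# Minimal sketch (assumed formulation): word-level knowledge distillation
# where a frozen teacher MT model reads the clean human transcript and the
# student MT model reads the erroneous ASR transcript of the same utterance.
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, temperature=1.0):
    # Cross-entropy between the teacher's soft targets and the student's
    # predictive distribution, averaged over target tokens.
    t_probs = F.softmax(teacher_logits / temperature, dim=-1)
    s_logp = F.log_softmax(student_logits / temperature, dim=-1)
    return -(t_probs * s_logp).sum(dim=-1).mean()

# Hypothetical training step (logit shapes: [batch, tgt_len, vocab]):
# with torch.no_grad():
#     teacher_logits = teacher_model(clean_src, tgt_prefix)   # frozen teacher
# student_logits = student_model(asr_src, tgt_prefix)
# loss = alpha * kd_loss(student_logits, teacher_logits) \
#        + (1 - alpha) * F.cross_entropy(
#              student_logits.view(-1, vocab_size), tgt_tokens.view(-1))

Here alpha interpolates between the distillation term and the usual cross-entropy against the reference translation, a common way to combine the two signals.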
Anthology ID:
2021.iwslt-1.24
Volume:
Proceedings of the 18th International Conference on Spoken Language Translation (IWSLT 2021)
Month:
August
Year:
2021
Address:
Bangkok, Thailand (online)
Editors:
Marcello Federico, Alex Waibel, Marta R. Costa-jussà, Jan Niehues, Sebastian Stüker, Elizabeth Salesky
Venue:
IWSLT
SIG:
SIGSLT
Publisher:
Association for Computational Linguistics
Pages:
198–205
URL:
https://aclanthology.org/2021.iwslt-1.24
DOI:
10.18653/v1/2021.iwslt-1.24
Cite (ACL):
Ryo Fukuda, Katsuhito Sudoh, and Satoshi Nakamura. 2021. On Knowledge Distillation for Translating Erroneous Speech Transcriptions. In Proceedings of the 18th International Conference on Spoken Language Translation (IWSLT 2021), pages 198–205, Bangkok, Thailand (online). Association for Computational Linguistics.
Cite (Informal):
On Knowledge Distillation for Translating Erroneous Speech Transcriptions (Fukuda et al., IWSLT 2021)
PDF:
https://aclanthology.org/2021.iwslt-1.24.pdf
Data:
MuST-C