IMS’ Systems for the IWSLT 2021 Low-Resource Speech Translation Task

Pavel Denisov, Manuel Mager, Ngoc Thang Vu


Abstract
This paper describes the IMS team's submission to the IWSLT 2021 Low-Resource Speech Translation Shared Task. We use state-of-the-art models combined with several data augmentation, multi-task, and transfer learning approaches for the automatic speech recognition (ASR) and machine translation (MT) steps of our cascaded system. We also explore the feasibility of a fully end-to-end speech translation (ST) model when only a very limited amount of ground-truth labeled data is available. Our best system achieves the best performance among all submitted systems for Congolese Swahili to English and French, with BLEU scores of 7.7 and 13.7 respectively, and the second-best result for Coastal Swahili to English, with a BLEU score of 14.9.
Anthology ID:
2021.iwslt-1.21
Volume:
Proceedings of the 18th International Conference on Spoken Language Translation (IWSLT 2021)
Month:
August
Year:
2021
Address:
Bangkok, Thailand (online)
Editors:
Marcello Federico, Alex Waibel, Marta R. Costa-jussà, Jan Niehues, Sebastian Stüker, Elizabeth Salesky
Venue:
IWSLT
SIG:
SIGSLT
Publisher:
Association for Computational Linguistics
Pages:
175–181
URL:
https://aclanthology.org/2021.iwslt-1.21
DOI:
10.18653/v1/2021.iwslt-1.21
Cite (ACL):
Pavel Denisov, Manuel Mager, and Ngoc Thang Vu. 2021. IMS’ Systems for the IWSLT 2021 Low-Resource Speech Translation Task. In Proceedings of the 18th International Conference on Spoken Language Translation (IWSLT 2021), pages 175–181, Bangkok, Thailand (online). Association for Computational Linguistics.
Cite (Informal):
IMS’ Systems for the IWSLT 2021 Low-Resource Speech Translation Task (Denisov et al., IWSLT 2021)
PDF:
https://aclanthology.org/2021.iwslt-1.21.pdf
Data
CCAligned, LibriSpeech, SPGISpeech, WikiMatrix