ZJU’s IWSLT 2021 Speech Translation System

Linlin Zhang


Abstract
In this paper, we describe Zhejiang University’s submission to the IWSLT 2021 Multilingual Speech Translation Task. This task focuses on speech translation (ST) research across many non-English source languages. Participants can choose to build constrained systems or unconstrained systems, which can use external data. We build both cascaded and end-to-end constrained speech translation systems, using only the provided data. In the cascaded approach, we combine Conformer-based automatic speech recognition (ASR) with Transformer-based neural machine translation (NMT). Our end-to-end direct speech translation systems use an ASR-pretrained encoder and multi-task decoders. The submitted systems are ensembles of different cascaded models.
Anthology ID:
2021.iwslt-1.16
Volume:
Proceedings of the 18th International Conference on Spoken Language Translation (IWSLT 2021)
Month:
August
Year:
2021
Address:
Bangkok, Thailand (online)
Editors:
Marcello Federico, Alex Waibel, Marta R. Costa-jussà, Jan Niehues, Sebastian Stuker, Elizabeth Salesky
Venue:
IWSLT
SIG:
SIGSLT
Publisher:
Association for Computational Linguistics
Pages:
144–148
URL:
https://aclanthology.org/2021.iwslt-1.16
DOI:
10.18653/v1/2021.iwslt-1.16
Cite (ACL):
Linlin Zhang. 2021. ZJU’s IWSLT 2021 Speech Translation System. In Proceedings of the 18th International Conference on Spoken Language Translation (IWSLT 2021), pages 144–148, Bangkok, Thailand (online). Association for Computational Linguistics.
Cite (Informal):
ZJU’s IWSLT 2021 Speech Translation System (Zhang, IWSLT 2021)
PDF:
https://aclanthology.org/2021.iwslt-1.16.pdf