FST: the FAIR Speech Translation System for the IWSLT21 Multilingual Shared Task

Yun Tang; Hongyu Gong; Xian Li; Changhan Wang; Juan Pino; Holger Schwenk; Naman Goyal

doi:10.18653/v1/2021.iwslt-1.14

Abstract

In this paper, we describe our end-to-end multilingual speech translation system submitted to the IWSLT 2021 evaluation campaign on the Multilingual Speech Translation shared task. Our system is built by leveraging transfer learning across modalities, tasks and languages. First, we leverage general-purpose multilingual modules pretrained with large amounts of unlabelled and labelled data. We further enable knowledge transfer from the text task to the speech task by training the two tasks jointly. Finally, our multilingual model is finetuned on speech translation task-specific data to achieve the best translation results. Experimental results show that our system outperforms the reported systems, including both end-to-end and cascaded approaches, by a large margin. In some translation directions, our speech translation results on the public Multilingual TEDx test set are even comparable to those of a strong text-to-text translation system that uses the oracle speech transcripts as input.

Anthology ID: 2021.iwslt-1.14
Volume: Proceedings of the 18th International Conference on Spoken Language Translation (IWSLT 2021)
Month: August
Year: 2021
Address: Bangkok, Thailand (online)
Editors: Marcello Federico, Alex Waibel, Marta R. Costa-jussà, Jan Niehues, Sebastian Stuker, Elizabeth Salesky
Venue: IWSLT
SIG: SIGSLT
Publisher: Association for Computational Linguistics
Pages: 131–137
URL: https://aclanthology.org/2021.iwslt-1.14/
DOI: 10.18653/v1/2021.iwslt-1.14
Cite (ACL): Yun Tang, Hongyu Gong, Xian Li, Changhan Wang, Juan Pino, Holger Schwenk, and Naman Goyal. 2021. FST: the FAIR Speech Translation System for the IWSLT21 Multilingual Shared Task. In Proceedings of the 18th International Conference on Spoken Language Translation (IWSLT 2021), pages 131–137, Bangkok, Thailand (online). Association for Computational Linguistics.
Cite (Informal): FST: the FAIR Speech Translation System for the IWSLT21 Multilingual Shared Task (Tang et al., IWSLT 2021)
PDF: https://aclanthology.org/2021.iwslt-1.14.pdf
