VUS at IWSLT 2021: A Finetuned Pipeline for Offline Speech Translation

Yong Rae Jo, Youngki Moon, Minji Jung, Jungyoon Choi, Jihyung Moon, Won Ik Cho


Abstract
In this technical report, we describe the fine-tuned ASR-MT pipeline used for the IWSLT 2021 offline speech translation shared task. We remove less useful speech samples by checking their WER with an ASR model, and further train a wav2vec- and Transformer-based ASR module on the filtered data. In addition, we clean up transcription errors that can interfere with the machine translation process and use the cleaned text to train a Transformer-based MT module. Finally, in the inference phase, we use a sentence boundary detection model trained on constrained data to merge fragmented ASR outputs into full sentences, which are then post-processed using part-of-speech information. The final translation is produced by the trained MT module. The pipeline achieves BLEU 20.37 on the dev set and BLEU 20.9 on the test set.
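
As an illustration of the WER-based filtering step described in the abstract, the following minimal sketch assumes a generic transcribe() callable standing in for the ASR model and uses the jiwer package for WER computation; the 0.5 threshold is a hypothetical placeholder, not a value reported by the authors.

    from jiwer import wer  # third-party WER implementation (pip install jiwer)

    def filter_samples(samples, transcribe, max_wer=0.5):
        """Keep only (audio, reference) pairs whose ASR hypothesis is close
        enough to the reference transcript; high-WER pairs are dropped as
        'less useful' training data. The threshold here is illustrative."""
        kept = []
        for audio, reference in samples:
            hypothesis = transcribe(audio)  # placeholder ASR inference call
            if wer(reference, hypothesis) <= max_wer:
                kept.append((audio, reference))
        return kept

Filtering by WER against an existing ASR model's output is one way to catch noisy or misaligned transcripts before fine-tuning on them; the appropriate threshold depends on the base model's quality and the noise level of the corpus.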
Anthology ID:
2021.iwslt-1.12
Volume:
Proceedings of the 18th International Conference on Spoken Language Translation (IWSLT 2021)
Month:
August
Year:
2021
Address:
Bangkok, Thailand (online)
Editors:
Marcello Federico, Alex Waibel, Marta R. Costa-jussà, Jan Niehues, Sebastian Stüker, Elizabeth Salesky
Venue:
IWSLT
SIG:
SIGSLT
Publisher:
Association for Computational Linguistics
Pages:
120–124
URL:
https://aclanthology.org/2021.iwslt-1.12
DOI:
10.18653/v1/2021.iwslt-1.12
Cite (ACL):
Yong Rae Jo, Youngki Moon, Minji Jung, Jungyoon Choi, Jihyung Moon, and Won Ik Cho. 2021. VUS at IWSLT 2021: A Finetuned Pipeline for Offline Speech Translation. In Proceedings of the 18th International Conference on Spoken Language Translation (IWSLT 2021), pages 120–124, Bangkok, Thailand (online). Association for Computational Linguistics.
Cite (Informal):
VUS at IWSLT 2021: A Finetuned Pipeline for Offline Speech Translation (Jo et al., IWSLT 2021)
PDF:
https://aclanthology.org/2021.iwslt-1.12.pdf
Data:
LibriSpeech, MuST-C