The Xiaomi Text-to-Text Simultaneous Speech Translation System for IWSLT 2022

Bao Guo, Mengge Liu, Wen Zhang, Hexuan Chen, Chang Mu, Xiang Li, Jianwei Cui, Bin Wang, Yuhang Guo


Abstract
This system paper describes the Xiaomi Translation System for the IWSLT 2022 Simultaneous Speech Translation (noted as SST) shared task. We participate in the English-to-Mandarin Chinese Text-to-Text (noted as T2T) track. Our system is built based on the Transformer model with novel techniques borrowed from our recent research work. For the data filtering, language-model-based and rule-based methods are conducted to filter the data to obtain high-quality bilingual parallel corpora. We also strengthen our system with some dominating techniques related to data augmentation, such as knowledge distillation, tagged back-translation, and iterative back-translation. We also incorporate novel training techniques such as R-drop, deep model, and large batch training which have been shown to be beneficial to the naive Transformer model. In the SST scenario, several variations of extttwait-k strategies are explored. Furthermore, in terms of robustness, both data-based and model-based ways are used to reduce the sensitivity of our system to Automatic Speech Recognition (ASR) outputs. We finally design some inference algorithms and use the adaptive-ensemble method based on multiple model variants to further improve the performance of the system. Compared with strong baselines, fusing all techniques can improve our system by 2 extasciitilde3 BLEU scores under different latency regimes.
Anthology ID:
2022.iwslt-1.17
Volume:
Proceedings of the 19th International Conference on Spoken Language Translation (IWSLT 2022)
Month:
May
Year:
2022
Address:
Dublin, Ireland (in-person and online)
Editors:
Elizabeth Salesky, Marcello Federico, Marta Costa-jussà
Venue:
IWSLT
SIG:
SIGSLT
Publisher:
Association for Computational Linguistics
Note:
Pages:
216–224
Language:
URL:
https://aclanthology.org/2022.iwslt-1.17
DOI:
10.18653/v1/2022.iwslt-1.17
Bibkey:
Cite (ACL):
Bao Guo, Mengge Liu, Wen Zhang, Hexuan Chen, Chang Mu, Xiang Li, Jianwei Cui, Bin Wang, and Yuhang Guo. 2022. The Xiaomi Text-to-Text Simultaneous Speech Translation System for IWSLT 2022. In Proceedings of the 19th International Conference on Spoken Language Translation (IWSLT 2022), pages 216–224, Dublin, Ireland (in-person and online). Association for Computational Linguistics.
Cite (Informal):
The Xiaomi Text-to-Text Simultaneous Speech Translation System for IWSLT 2022 (Guo et al., IWSLT 2022)
Copy Citation:
PDF:
https://aclanthology.org/2022.iwslt-1.17.pdf