Mengge Liu


2022

The Xiaomi Text-to-Text Simultaneous Speech Translation System for IWSLT 2022
Bao Guo | Mengge Liu | Wen Zhang | Hexuan Chen | Chang Mu | Xiang Li | Jianwei Cui | Bin Wang | Yuhang Guo
Proceedings of the 19th International Conference on Spoken Language Translation (IWSLT 2022)

This system paper describes the Xiaomi Translation System for the IWSLT 2022 Simultaneous Speech Translation (noted as SST) shared task. We participate in the English-to-Mandarin Chinese Text-to-Text (noted as T2T) track. Our system is built based on the Transformer model with novel techniques borrowed from our recent research work. For the data filtering, language-model-based and rule-based methods are conducted to filter the data to obtain high-quality bilingual parallel corpora. We also strengthen our system with some dominating techniques related to data augmentation, such as knowledge distillation, tagged back-translation, and iterative back-translation. We also incorporate novel training techniques such as R-drop, deep model, and large batch training which have been shown to be beneficial to the naive Transformer model. In the SST scenario, several variations of wait-k strategies are explored. Furthermore, in terms of robustness, both data-based and model-based ways are used to reduce the sensitivity of our system to Automatic Speech Recognition (ASR) outputs. We finally design some inference algorithms and use the adaptive-ensemble method based on multiple model variants to further improve the performance of the system. Compared with strong baselines, fusing all techniques can improve our system by 2 to 3 BLEU scores under different latency regimes.

2021

BIT’s system for AutoSimulTrans2021
Mengge Liu | Shuoying Chen | Minqin Li | Zhipeng Wang | Yuhang Guo
Proceedings of the Second Workshop on Automatic Simultaneous Translation

In this paper we introduce our Chinese-English simultaneous translation system participating in AutoSimulTrans2021. In simultaneous translation, translation quality and delay are both important. In order to reduce the translation delay, we cut the streaming-input source sentence into segments and translate the segments before the full sentence is received. In order to obtain high-quality translations, we pre-train a translation model with adequate corpus and fine-tune the model with domain adaptation and sentence length adaptation. The experimental results on the evaluation data show that our system performs better than the baseline system.

2006

A Chinese Automatic Text Summarization system for mobile devices
Lei Yu | Mengge Liu | Fuji Ren | Shingo Kuroiwa
Proceedings of the 20th Pacific Asia Conference on Language, Information and Computation