Shushu Wang


2025

Improve Speech Translation Through Text Rewrite
Jing Wu | Shushu Wang | Kai Fan | Wei Luo | Minpeng Liao | Zhongqiang Huang
Proceedings of the 31st International Conference on Computational Linguistics: Industry Track

Despite recent progress in Speech Translation (ST) research, the challenges posed by inherent speech phenomena that distinguish transcribed speech from written text are not well addressed. The informal and erroneous nature of spontaneous speech is inadequately represented in the typical parallel text available for building translation models. We propose to address these issues through a text rewrite approach that aims to transform transcribed speech into a cleaner style more in line with the expectations of translation models built from written text. Moreover, the advantages of the rewrite model can be effectively distilled into a standalone translation model. Experiments on several benchmarks, using both publicly available and in-house translation models, demonstrate that adding a rewrite model to a traditional ST pipeline is a cost-effective way to address a variety of speech irregularities and improve speech translation quality for multiple language directions and domains.

2023

Better Simultaneous Translation with Monotonic Knowledge Distillation
Shushu Wang | Jing Wu | Kai Fan | Wei Luo | Jun Xiao | Zhongqiang Huang
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Simultaneous machine translation (SiMT) presents a unique challenge as it requires generating target tokens before the source sentence is fully consumed. This can lead to the hallucination problem, where target tokens are generated without support from the source sentence. The prefix-to-prefix training data used to train SiMT models are not always parallel, due to divergent word order between the source and target languages, and can contribute to the problem. In this paper, we propose a novel approach that leverages traditional translation models as teachers and employs a two-stage beam search algorithm to generate monotonic yet accurate reference translations for sequence-level knowledge distillation. Experimental results demonstrate the significant improvements achieved by our approach over multiple strong SiMT baselines, leading to new state-of-the-art performance across various language pairs. Notably, when evaluated on a monotonic version of the WMT15 De-En test set, which includes references generated in a more monotonic style by professional translators, our approach achieves even more substantial improvement over the baselines. The source code and data are publicly available for further exploration.

Adaptive Policy with Wait-k Model for Simultaneous Translation
Libo Zhao | Kai Fan | Wei Luo | Wu Jing | Shushu Wang | Ziqian Zeng | Zhongqiang Huang
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

Simultaneous machine translation (SiMT) requires a robust read/write policy in conjunction with a high-quality translation model. Traditional methods rely on either a fixed wait-k policy coupled with a standalone wait-k translation model, or an adaptive policy jointly trained with the translation model. In this study, we propose a more flexible approach by decoupling the adaptive policy model from the translation model. Our motivation stems from the observation that a standalone multi-path wait-k model performs competitively with adaptive policies utilized in state-of-the-art SiMT approaches. Specifically, we introduce DaP, a divergence-based adaptive policy, that makes read/write decisions for any translation model based on the potential divergence in translation distributions resulting from future information. DaP extends a frozen wait-k model with lightweight parameters, and is both memory and computation efficient. Experimental results across various benchmarks demonstrate that our approach offers an improved trade-off between translation accuracy and latency, outperforming strong baselines.
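
For readers unfamiliar with the fixed wait-k policy that the abstract above contrasts with adaptive policies, the following is a minimal sketch of its read/write schedule. It illustrates only the standard wait-k rule (read k source tokens, then alternate writing and reading until the source is exhausted); it is not the DaP method from the paper, and the function name `wait_k_schedule` and the "READ"/"WRITE" labels are assumptions chosen for illustration.

```python
from typing import List


def wait_k_schedule(k: int, src_len: int, tgt_len: int) -> List[str]:
    """Return the READ/WRITE action sequence of a fixed wait-k policy.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    actions: List[str] = []
    read = 0  # number of source tokens consumed so far
    for t in range(1, tgt_len + 1):
        # Before emitting target token t, wait-k requires
        # min(k + t - 1, src_len) source tokens to have been read.
        needed = min(k + t - 1, src_len)
        while read < needed:
            actions.append("READ")
            read += 1
        actions.append("WRITE")
    return actions


if __name__ == "__main__":
    # With k=2, a 4-token source and a 4-token target:
    print(wait_k_schedule(2, 4, 4))
```

An adaptive policy, by contrast, would decide between READ and WRITE at each step from the model state rather than from this fixed schedule.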