Wang Minghan
2023
HW-TSC at IWSLT2023: Break the Quality Ceiling of Offline Track via Pre-Training and Domain Adaptation
Zongyao Li
|
Zhanglin Wu
|
Zhiqiang Rao
|
Xie YuHao
|
Guo JiaXin
|
Daimeng Wei
|
Hengchao Shang
|
Wang Minghan
|
Xiaoyu Chen
|
Zhengzhe Yu
|
Li ShaoJun
|
Lei LiZhi
|
Hao Yang
Proceedings of the 20th International Conference on Spoken Language Translation (IWSLT 2023)
This paper presents HW-TSC’s submissions to the IWSLT 2023 Offline Speech Translation task, including speech translation of talks from English to German, Chinese, and Japanese, respectively. We participate in all three conditions (constrained training, constrained with large language models training, and unconstrained training) with models of cascaded architectures. We use data enhancement, pre-training models and other means to improve the ASR quality, and R-Drop, deep model, domain data selection, etc. to improve the translation quality. Compared with last year’s best results, we achieve 2.1 BLEU improvement on the MuST-C English-German test set.
2021
HW-TSC’s Participation at WMT 2021 Quality Estimation Shared Task
Yimeng Chen
|
Chang Su
|
Yingtao Zhang
|
Yuxia Wang
|
Xiang Geng
|
Hao Yang
|
Shimin Tao
|
Guo Jiaxin
|
Wang Minghan
|
Min Zhang
|
Yujia Liu
|
Shujian Huang
Proceedings of the Sixth Conference on Machine Translation
This paper presents our work in WMT 2021 Quality Estimation (QE) Shared Task. We participated in all of the three sub-tasks, including Sentence-Level Direct Assessment (DA) task, Word and Sentence-Level Post-editing Effort task and Critical Error Detection task, in all language pairs. Our systems employ the framework of Predictor-Estimator, concretely with a pre-trained XLM-Roberta as Predictor and task-specific classifier or regressor as Estimator. For all tasks, we improve our systems by incorporating post-edit sentence or additional high-quality translation sentence in the way of multitask learning or encoding it with predictors directly. Moreover, in zero-shot setting, our data augmentation strategy based on Monte-Carlo Dropout brings up significant improvement on DA sub-task. Notably, our submissions achieve remarkable results over all tasks.
Search
Co-authors
- Hao Yang 2
- Guo Jiaxin 2
- Yimeng Chen 1
- Chang Su 1
- Yingtao Zhang 1
- show all...