HW-TSC’s Submissions to the WMT22 Word-Level Auto Completion Task
Hao Yang, Hengchao Shang, Zongyao Li, Daimeng Wei, Xianghui He, Xiaoyu Chen, Zhengzhe Yu, Jiaxin Guo, Jinlong Yang, Shaojun Li, Yuanchang Luo, Yuhao Xie, Lizhi Lei, Ying Qin
Abstract
This paper presents the submissions of Huawei Translation Services Center (HW-TSC) to the WMT 2022 Word-Level AutoCompletion Task. We propose an end-to-end autoregressive model with bi-context, based on the Transformer, to solve the task. The model uses a mixture of subword and character encoding units to jointly encode the human input, the target-side context, and the decoded sequence, ensuring full utilization of the available information. We use a single model to handle the four types of data structures in the task. During training, we experiment with using a machine translation model as the pre-trained model and fine-tune it for the task. We also add BERT-style MLM data at the fine-tuning stage to improve model performance. We participate in the zh→en, en→de, and de→en directions and win first place in all three tracks. In particular, we outperform the second-place system by more than 5% accuracy on the zh→en and en→de tracks. The result is corroborated by human evaluation as well, demonstrating the effectiveness of our model.
- Anthology ID:
- 2022.wmt-1.122
- Volume:
- Proceedings of the Seventh Conference on Machine Translation (WMT)
- Month:
- December
- Year:
- 2022
- Address:
- Abu Dhabi, United Arab Emirates (Hybrid)
- Editors:
- Philipp Koehn, Loïc Barrault, Ondřej Bojar, Fethi Bougares, Rajen Chatterjee, Marta R. Costa-jussà, Christian Federmann, Mark Fishel, Alexander Fraser, Markus Freitag, Yvette Graham, Roman Grundkiewicz, Paco Guzman, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Tom Kocmi, André Martins, Makoto Morishita, Christof Monz, Masaaki Nagata, Toshiaki Nakazawa, Matteo Negri, Aurélie Névéol, Mariana Neves, Martin Popel, Marco Turchi, Marcos Zampieri
- Venue:
- WMT
- SIG:
- SIGMT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 1192–1197
- URL:
- https://aclanthology.org/2022.wmt-1.122
- Cite (ACL):
- Hao Yang, Hengchao Shang, Zongyao Li, Daimeng Wei, Xianghui He, Xiaoyu Chen, Zhengzhe Yu, Jiaxin Guo, Jinlong Yang, Shaojun Li, Yuanchang Luo, Yuhao Xie, Lizhi Lei, and Ying Qin. 2022. HW-TSC’s Submissions to the WMT22 Word-Level Auto Completion Task. In Proceedings of the Seventh Conference on Machine Translation (WMT), pages 1192–1197, Abu Dhabi, United Arab Emirates (Hybrid). Association for Computational Linguistics.
- Cite (Informal):
- HW-TSC’s Submissions to the WMT22 Word-Level Auto Completion Task (Yang et al., WMT 2022)
- PDF:
- https://aclanthology.org/2022.wmt-1.122.pdf
Export citation
@inproceedings{yang-etal-2022-hw-tscs,
    title = "{HW}-{TSC}{'}s Submissions to the {WMT}22 Word-Level Auto Completion Task",
    author = "Yang, Hao and Shang, Hengchao and Li, Zongyao and Wei, Daimeng and He, Xianghui and Chen, Xiaoyu and Yu, Zhengzhe and Guo, Jiaxin and Yang, Jinlong and Li, Shaojun and Luo, Yuanchang and Xie, Yuhao and Lei, Lizhi and Qin, Ying",
    editor = {Koehn, Philipp and Barrault, Lo{\"\i}c and Bojar, Ond{\v{r}}ej and Bougares, Fethi and Chatterjee, Rajen and Costa-juss{\`a}, Marta R. and Federmann, Christian and Fishel, Mark and Fraser, Alexander and Freitag, Markus and Graham, Yvette and Grundkiewicz, Roman and Guzman, Paco and Haddow, Barry and Huck, Matthias and Jimeno Yepes, Antonio and Kocmi, Tom and Martins, Andr{\'e} and Morishita, Makoto and Monz, Christof and Nagata, Masaaki and Nakazawa, Toshiaki and Negri, Matteo and N{\'e}v{\'e}ol, Aur{\'e}lie and Neves, Mariana and Popel, Martin and Turchi, Marco and Zampieri, Marcos},
    booktitle = "Proceedings of the Seventh Conference on Machine Translation (WMT)",
    month = dec,
    year = "2022",
    address = "Abu Dhabi, United Arab Emirates (Hybrid)",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2022.wmt-1.122",
    pages = "1192--1197",
    abstract = "This paper presents the submissions of Huawei Translation Services Center (HW-TSC) to the WMT 2022 Word-Level AutoCompletion Task. We propose an end-to-end autoregressive model with bi-context, based on the Transformer, to solve the task. The model uses a mixture of subword and character encoding units to jointly encode the human input, the target-side context, and the decoded sequence, ensuring full utilization of the available information. We use a single model to handle the four types of data structures in the task. During training, we experiment with using a machine translation model as the pre-trained model and fine-tune it for the task. We also add BERT-style MLM data at the fine-tuning stage to improve model performance. We participate in the zh$\rightarrow$en, en$\rightarrow$de, and de$\rightarrow$en directions and win first place in all three tracks. In particular, we outperform the second-place system by more than 5{\%} accuracy on the zh$\rightarrow$en and en$\rightarrow$de tracks. The result is corroborated by human evaluation as well, demonstrating the effectiveness of our model.",
}
Markdown (Informal)
[HW-TSC’s Submissions to the WMT22 Word-Level Auto Completion Task](https://aclanthology.org/2022.wmt-1.122) (Yang et al., WMT 2022)
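The abstract's core idea — jointly encoding the source sentence, the target-side bi-context, and the human-typed input, with subword units for sentences and character units for the partial word — can be illustrated with a minimal sketch. The special tokens, function name, and input layout below are assumptions for demonstration only, not the authors' actual scheme.

```python
# Hypothetical sketch of a joint input encoding for word-level autocompletion:
# source and target-side bi-context as subword units, the typed prefix as
# characters. The <src>/<left>/<right>/<typed> markers are illustrative.

def build_wlac_input(source_subwords, left_context, right_context, typed_prefix):
    """Concatenate source, left/right target context, and typed characters
    into one sequence for an autoregressive completion model. Leaving the
    left and/or right context empty covers the task's zero-context, prefix,
    and suffix data types alongside the full bi-context case."""
    prefix_chars = list(typed_prefix)  # character-level units for the partial word
    return (
        ["<src>"] + source_subwords
        + ["<left>"] + left_context
        + ["<right>"] + right_context
        + ["<typed>"] + prefix_chars
    )

# de->en example: the human has typed "a" between "This is" and "test ."
sequence = build_wlac_input(
    ["Das", "ist", "ein", "Test", "."],  # source subwords
    ["This", "is"],                      # left target context
    ["test", "."],                       # right target context
    "a",                                 # characters typed so far
)
```

An autoregressive decoder conditioned on such a sequence would then generate the completed target word character by character or subword by subword.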