HW-TSC’s Participation in the WMT 2021 Large-Scale Multilingual Translation Task

Zhengzhe Yu, Daimeng Wei, Zongyao Li, Hengchao Shang, Xiaoyu Chen, Zhanglin Wu, Jiaxin Guo, Minghan Wang, Lizhi Lei, Min Zhang, Hao Yang, Ying Qin


Abstract
This paper presents the submission of Huawei Translation Services Center (HW-TSC) to the WMT 2021 Large-Scale Multilingual Translation Task. We participate in Small Track #2, covering 6 languages: Javanese (Jv), Indonesian (Id), Malay (Ms), Tagalog (Tl), Tamil (Ta) and English (En), with 30 directions under the constrained condition. We use the Transformer architecture and obtain the best performance via multiple variants with larger parameter sizes. We train a single multilingual model to translate all 30 directions. We perform detailed pre-processing and filtering on the provided large-scale bilingual and monolingual datasets. We train our models with several commonly used strategies, such as Back Translation, Forward Translation, Ensemble Knowledge Distillation, and Adapter Fine-tuning. Our model ultimately obtains competitive results.
Anthology ID:
2021.wmt-1.55
Volume:
Proceedings of the Sixth Conference on Machine Translation
Month:
November
Year:
2021
Address:
Online
Venues:
EMNLP | WMT
SIG:
SIGMT
Publisher:
Association for Computational Linguistics
Pages:
456–463
URL:
https://aclanthology.org/2021.wmt-1.55
Cite (ACL):
Zhengzhe Yu, Daimeng Wei, Zongyao Li, Hengchao Shang, Xiaoyu Chen, Zhanglin Wu, Jiaxin Guo, Minghan Wang, Lizhi Lei, Min Zhang, Hao Yang, and Ying Qin. 2021. HW-TSC’s Participation in the WMT 2021 Large-Scale Multilingual Translation Task. In Proceedings of the Sixth Conference on Machine Translation, pages 456–463, Online. Association for Computational Linguistics.
Cite (Informal):
HW-TSC’s Participation in the WMT 2021 Large-Scale Multilingual Translation Task (Yu et al., WMT 2021)
PDF:
https://aclanthology.org/2021.wmt-1.55.pdf