HW-TSC’s Participation in the WAT 2020 Indic Languages Multilingual Task

Zhengzhe Yu, Zhanglin Wu, Xiaoyu Chen, Daimeng Wei, Hengchao Shang, Jiaxin Guo, Zongyao Li, Minghan Wang, Liangyou Li, Lizhi Lei, Hao Yang, Ying Qin


Abstract
This paper describes our work in the WAT 2020 Indic Multilingual Translation Task. We participated in all 7 language pairs (En<->Bn/Hi/Gu/Ml/Mr/Ta/Te) in both directions under the constrained condition, using only the officially provided data. Using the Transformer as a baseline, our Multi->En and En->Multi translation systems achieve the best performance. Detailed data filtering and data domain selection are the keys to performance enhancement in our experiments, with an average improvement of 2.6 BLEU points per language pair for the En->Multi system and 4.6 BLEU points for the Multi->En system. In addition, we employed a language-independent adapter to further improve system performance. Our submission obtains competitive results in the final evaluation.
Anthology ID:
2020.wat-1.8
Volume:
Proceedings of the 7th Workshop on Asian Translation
Month:
December
Year:
2020
Address:
Suzhou, China
Venues:
AACL | WAT
Publisher:
Association for Computational Linguistics
Pages:
92–97
URL:
https://aclanthology.org/2020.wat-1.8
PDF:
https://aclanthology.org/2020.wat-1.8.pdf