GX@DravidianLangTech-EACL2021: Multilingual Neural Machine Translation and Back-translation

Wanying Xie


Abstract
In this paper, we describe the GX system in the EACL2021 shared task on machine translation in Dravidian languages. Given the small amount of parallel training data, we adopt two methods to improve overall performance: (1) multilingual translation, where a shared encoder-decoder model handles multiple languages simultaneously to improve translation quality for these languages; and (2) back-translation, where we collect other open-source parallel and monolingual data and apply back-translation to benefit from the monolingual data. The experimental results show that our system achieves satisfactory translation results in these Dravidian languages and ranks first in English-Telugu and Tamil-Telugu translation.
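The two data-preparation ideas named in the abstract can be illustrated with a minimal sketch. It is not the GX system's actual pipeline. It assumes the common convention of prepending a target-language tag (written here as `<2te>`) so that a single shared model can serve several directions, and it stands in a hypothetical `reverse_translate` callable for a trained target-to-source model. All function names and the tag format are illustrative assumptions.

```python
from typing import Callable, Dict, List, Tuple

Pair = Tuple[str, str]  # (source sentence, target sentence)


def tag_source(src: str, tgt_lang: str) -> str:
    """Prepend a target-language tag (assumed convention, not confirmed by the paper)."""
    return f"<2{tgt_lang}> {src}"


def build_multilingual_corpus(corpora: Dict[Tuple[str, str], List[Pair]]) -> List[Pair]:
    """Merge per-direction corpora into one tagged training set for a shared model."""
    merged: List[Pair] = []
    for (_src_lang, tgt_lang), pairs in corpora.items():
        for src, tgt in pairs:
            merged.append((tag_source(src, tgt_lang), tgt))
    return merged


def back_translate(mono_tgt: List[str],
                   reverse_translate: Callable[[str], str]) -> List[Pair]:
    """Turn target-side monolingual text into synthetic (source, target) pairs.

    `reverse_translate` is a hypothetical stand-in for a target-to-source model.
    """
    return [(reverse_translate(t), t) for t in mono_tgt]


# Toy usage with placeholder strings instead of real sentences.
corpora = {
    ("en", "te"): [("hello", "te_hello")],
    ("ta", "te"): [("vanakkam", "te_hello")],
}
merged = build_multilingual_corpus(corpora)
synthetic = back_translate(["te_sentence"], lambda s: s.upper())
```

In a setup like this, the synthetic pairs would typically be concatenated with the genuine parallel data before training, so the model sees real target-side text paired with machine-generated sources.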
Anthology ID:
2021.dravidianlangtech-1.18
Volume:
Proceedings of the First Workshop on Speech and Language Technologies for Dravidian Languages
Month:
April
Year:
2021
Address:
Kyiv
Editors:
Bharathi Raja Chakravarthi, Ruba Priyadharshini, Anand Kumar M, Parameswari Krishnamurthy, Elizabeth Sherly
Venue:
DravidianLangTech
Publisher:
Association for Computational Linguistics
Pages:
146–153
URL:
https://aclanthology.org/2021.dravidianlangtech-1.18
Cite (ACL):
Wanying Xie. 2021. GX@DravidianLangTech-EACL2021: Multilingual Neural Machine Translation and Back-translation. In Proceedings of the First Workshop on Speech and Language Technologies for Dravidian Languages, pages 146–153, Kyiv. Association for Computational Linguistics.
Cite (Informal):
GX@DravidianLangTech-EACL2021: Multilingual Neural Machine Translation and Back-translation (Xie, DravidianLangTech 2021)
PDF:
https://aclanthology.org/2021.dravidianlangtech-1.18.pdf
Software:
 2021.dravidianlangtech-1.18.Software.zip
Dataset:
 2021.dravidianlangtech-1.18.Dataset.zip