Neural Machine Translation for Similar Languages: The Case of Indo-Aryan Languages

Santanu Pal, Marcos Zampieri


Abstract
In this paper, we present the WIPRO-RIT systems submitted to the Similar Language Translation shared task at WMT 2020. The second edition of this shared task featured parallel data from pairs/groups of similar languages from three different language families: Indo-Aryan languages (Hindi and Marathi), Romance languages (Catalan, Portuguese, and Spanish), and South Slavic languages (Croatian, Serbian, and Slovene). We report the results obtained by our systems in translating from Hindi to Marathi and from Marathi to Hindi. WIPRO-RIT achieved competitive performance, ranking 1st in Marathi to Hindi and 2nd in Hindi to Marathi translation among 22 systems.
Anthology ID:
2020.wmt-1.50
Volume:
Proceedings of the Fifth Conference on Machine Translation
Month:
November
Year:
2020
Address:
Online
Editors:
Loïc Barrault, Ondřej Bojar, Fethi Bougares, Rajen Chatterjee, Marta R. Costa-jussà, Christian Federmann, Mark Fishel, Alexander Fraser, Yvette Graham, Paco Guzman, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Philipp Koehn, André Martins, Makoto Morishita, Christof Monz, Masaaki Nagata, Toshiaki Nakazawa, Matteo Negri
Venue:
WMT
SIG:
SIGMT
Publisher:
Association for Computational Linguistics
Pages:
424–429
URL:
https://aclanthology.org/2020.wmt-1.50
Cite (ACL):
Santanu Pal and Marcos Zampieri. 2020. Neural Machine Translation for Similar Languages: The Case of Indo-Aryan Languages. In Proceedings of the Fifth Conference on Machine Translation, pages 424–429, Online. Association for Computational Linguistics.
Cite (Informal):
Neural Machine Translation for Similar Languages: The Case of Indo-Aryan Languages (Pal & Zampieri, WMT 2020)
PDF:
https://aclanthology.org/2020.wmt-1.50.pdf
Data
PMIndia, WMT 2014