Zero-shot translation among Indian languages

Rudali Huidrom, Yves Lepage


Abstract
Standard neural machine translation (NMT) allows a model to perform translation between a single pair of languages. Multilingual NMT, on the other hand, allows a model to perform translation between several language pairs, even between language pairs for which no sentence pair has been seen during training (zero-shot translation). This paper presents experiments with zero-shot translation on low-resource Indian languages with a very small amount of data for each language pair. We first report results on balanced data over all considered language pairs. We then expand our experiments over three additional rounds, increasing the training data by 2,000 sentence pairs in each round for some of the language pairs. We obtain an increase in translation accuracy, with the balanced-data score multiplied by 7 for Manipuri to Hindi in Round III of zero-shot translation.
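The abstract relies on the standard mechanism behind zero-shot multilingual NMT: a single model is trained on several language pairs at once, with the desired target language signalled by a token prepended to the source sentence (Johnson et al., 2017). The paper does not spell out its preprocessing, so the sketch below is only an illustration of that general idea; the tag format, language codes, and toy sentences are assumptions, not the authors' actual pipeline, and the real training data would come from the PMIndia corpus listed under Data.

```python
# Minimal sketch of target-language tagging for multilingual NMT.
# All identifiers and tag formats here are illustrative assumptions.

def tag_parallel_corpus(pairs, tgt_lang):
    """Prepend a target-language token to each source sentence.

    A single model trained on tagged data for, say, mni->en and en->hi
    can be asked for mni->hi at test time (zero-shot) simply by switching
    the tag, even though no Manipuri-Hindi sentence pair was seen.
    """
    tag = f"<2{tgt_lang}>"  # e.g. "<2hi>" requests Hindi output
    return [(f"{tag} {src}", tgt) for src, tgt in pairs]


if __name__ == "__main__":
    # Toy examples only; real sentence pairs would be drawn from PMIndia.
    mni_en = [("mni source sentence", "an English sentence")]
    en_hi = [("an English sentence", "a Hindi sentence")]

    training_data = (tag_parallel_corpus(mni_en, "en")
                     + tag_parallel_corpus(en_hi, "hi"))
    for src, tgt in training_data:
        print(src, "|||", tgt)
```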
Anthology ID:
2020.loresmt-1.7
Volume:
Proceedings of the 3rd Workshop on Technologies for MT of Low Resource Languages
Month:
December
Year:
2020
Address:
Suzhou, China
Editors:
Alina Karakanta, Atul Kr. Ojha, Chao-Hong Liu, Jade Abbott, John Ortega, Jonathan Washington, Nathaniel Oco, Surafel Melaku Lakew, Tommi A Pirinen, Valentin Malykh, Varvara Logacheva, Xiaobing Zhao
Venue:
LoResMT
Publisher:
Association for Computational Linguistics
Pages:
47–54
URL:
https://aclanthology.org/2020.loresmt-1.7
Cite (ACL):
Rudali Huidrom and Yves Lepage. 2020. Zero-shot translation among Indian languages. In Proceedings of the 3rd Workshop on Technologies for MT of Low Resource Languages, pages 47–54, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
Zero-shot translation among Indian languages (Huidrom & Lepage, LoResMT 2020)
PDF:
https://aclanthology.org/2020.loresmt-1.7.pdf
Data
PMIndia