Unsupervised Neural Machine Translation for English and Manipuri

Salam Michael Singh, Thoudam Doren Singh


Abstract
The availability of bitext has been a key challenge for conventional machine translation systems, which require large amounts of parallel data. In this work, we devise an unsupervised neural machine translation (UNMT) system consisting of a transformer-based shared encoder and language-specific decoders, trained with denoising autoencoding and backtranslation, and evaluated with additional multiple test references on the Manipuri side. We report our work in a low-resource setting for the English (en) - Manipuri (mni) language pair and attain BLEU scores of 3.1 for en-mni and 2.7 for mni-en. Subjective evaluation of the translated output gives encouraging findings.
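The abstract names denoising autoencoding as a training objective but does not specify the noise model. As a minimal sketch only, the snippet below assumes the word-dropout and local-shuffle corruption commonly used in the UNMT literature; the function name `add_noise` and the parameters `p_drop` and `k` are hypothetical and are not taken from the paper.

```python
import random


def add_noise(tokens, p_drop=0.1, k=3, rng=None):
    """Corrupt a token list for denoising autoencoder training.

    Assumed noise model (not confirmed by the paper):
      1. drop each token independently with probability p_drop;
      2. shuffle locally so that no token moves more than k positions.
    """
    rng = rng or random.Random()

    # Step 1: word dropout, keeping at least one token of a non-empty input.
    kept = [t for t in tokens if rng.random() >= p_drop]
    if not kept and tokens:
        kept = [rng.choice(tokens)]

    # Step 2: local shuffle. Each index i gets the key i + u with u in
    # [0, k + 1); sorting by key bounds every displacement by k.
    keys = [i + rng.random() * (k + 1) for i in range(len(kept))]
    order = sorted(range(len(kept)), key=keys.__getitem__)
    return [kept[i] for i in order]
```

In a denoising setup of this kind, the model would receive `add_noise(sentence)` as input and be trained to reconstruct the original `sentence` through the shared encoder and the decoder for that sentence's language.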
Anthology ID:
2020.loresmt-1.10
Volume:
Proceedings of the 3rd Workshop on Technologies for MT of Low Resource Languages
Month:
December
Year:
2020
Address:
Suzhou, China
Editors:
Alina Karakanta, Atul Kr. Ojha, Chao-Hong Liu, Jade Abbott, John Ortega, Jonathan Washington, Nathaniel Oco, Surafel Melaku Lakew, Tommi A Pirinen, Valentin Malykh, Varvara Logacheva, Xiaobing Zhao
Venue:
LoResMT
Publisher:
Association for Computational Linguistics
Pages:
69–78
URL:
https://aclanthology.org/2020.loresmt-1.10
Cite (ACL):
Salam Michael Singh and Thoudam Doren Singh. 2020. Unsupervised Neural Machine Translation for English and Manipuri. In Proceedings of the 3rd Workshop on Technologies for MT of Low Resource Languages, pages 69–78, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
Unsupervised Neural Machine Translation for English and Manipuri (Singh & Singh, LoResMT 2020)
PDF:
https://aclanthology.org/2020.loresmt-1.10.pdf