Addressing word-order Divergence in Multilingual Neural Machine Translation for extremely Low Resource Languages

Rudra Murthy, Anoop Kunchukuttan, Pushpak Bhattacharyya


Abstract
Transfer learning approaches for Neural Machine Translation (NMT) train an NMT model on an assisting language-target language pair (parent model), which is later fine-tuned for the source language-target language pair of interest (child model), with the target language being the same. In many cases, the assisting language has a different word order from the source language. We show that divergent word order limits the benefits from transfer learning when little to no parallel corpus between the source and target language is available. To bridge this divergence, we propose to pre-order the assisting language sentences to match the word order of the source language and train the parent model. Our experiments on many language pairs show that bridging the word order gap leads to significant improvement in the translation quality in extremely low-resource scenarios.
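The pre-ordering idea in the abstract can be illustrated with a toy sketch. It is not the authors' pre-ordering system: it assumes each assisting-language sentence is already split into chunks with hypothetical role labels ("S", "V", "O"), and it applies a single illustrative rule that moves verb chunks to the end, turning subject-verb-object order into subject-object-verb order. The function name `preorder_svo_to_sov` and the chunk format are assumptions made for this example only.

```python
from typing import List, Tuple

# (role, phrase) pair; the role labels "S", "V", "O" are hypothetical
Chunk = Tuple[str, str]


def preorder_svo_to_sov(chunks: List[Chunk]) -> List[str]:
    """Toy rule: move verb chunks after all other chunks.

    The relative order of the non-verb chunks is preserved, as is the
    relative order of the verb chunks among themselves.
    """
    non_verbs = [phrase for role, phrase in chunks if role != "V"]
    verbs = [phrase for role, phrase in chunks if role == "V"]
    return non_verbs + verbs


# Assisting-language sentence in subject-verb-object order
sentence = [("S", "the boy"), ("V", "eats"), ("O", "an apple")]
reordered = " ".join(preorder_svo_to_sov(sentence))
print(reordered)  # the boy an apple eats
```

Under this sketch, the parent model would be trained on the reordered assisting-language sentences paired with their unchanged target-language translations, and then fine-tuned on whatever source-target data is available.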
Anthology ID:
N19-1387
Volume:
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)
Month:
June
Year:
2019
Address:
Minneapolis, Minnesota
Editors:
Jill Burstein, Christy Doran, Thamar Solorio
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
3868–3873
URL:
https://aclanthology.org/N19-1387
DOI:
10.18653/v1/N19-1387
Cite (ACL):
Rudra Murthy, Anoop Kunchukuttan, and Pushpak Bhattacharyya. 2019. Addressing word-order Divergence in Multilingual Neural Machine Translation for extremely Low Resource Languages. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 3868–3873, Minneapolis, Minnesota. Association for Computational Linguistics.
Cite (Informal):
Addressing word-order Divergence in Multilingual Neural Machine Translation for extremely Low Resource Languages (Murthy et al., NAACL 2019)
PDF:
https://aclanthology.org/N19-1387.pdf