Efficient Unsupervised NMT for Related Languages with Cross-Lingual Language Models and Fidelity Objectives

Rami Aly, Andrew Caines, Paula Buttery


Abstract
The most successful approach to Neural Machine Translation (NMT) when only monolingual training data is available, called unsupervised machine translation, is based on back-translation, where noisy translations are generated to turn the task into a supervised one. However, back-translation is computationally very expensive and inefficient. This work explores a novel, efficient approach to unsupervised NMT. A transformer, initialized with cross-lingual language model weights, is fine-tuned exclusively on monolingual data of the target language by jointly training on a paraphrasing objective and a denoising autoencoder objective. Experiments are conducted on WMT datasets for German-English, French-English, and Romanian-English. Results are competitive with strong baseline unsupervised NMT models, especially for closely related source languages (German) compared to more distant ones (Romanian, French), while requiring about an order of magnitude less training time.
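The denoising autoencoder objective mentioned above trains the model to reconstruct a clean sentence from a corrupted copy of itself. A minimal sketch of such an input-corruption step (token dropout plus bounded local shuffling, a noise model commonly used in unsupervised NMT) is shown below; the function name and parameter values are illustrative, not taken from the paper:

```python
import random

def add_noise(tokens, drop_prob=0.1, shuffle_window=3, seed=0):
    """Corrupt a token sequence for a denoising autoencoder objective:
    randomly drop tokens, then shuffle them locally so that no token
    moves more than `shuffle_window` positions from where it started."""
    rng = random.Random(seed)
    # Word dropout: each token is removed with probability drop_prob.
    kept = [t for t in tokens if rng.random() > drop_prob]
    if not kept:  # never return an empty sequence
        kept = tokens[:1]
    # Local shuffling: sort by (index + random offset in [0, shuffle_window)),
    # which bounds each token's displacement to the window size.
    keys = [i + rng.uniform(0, shuffle_window) for i in range(len(kept))]
    return [t for _, t in sorted(zip(keys, kept), key=lambda p: p[0])]

src = "the quick brown fox jumps over the lazy dog".split()
noisy = add_noise(src)
# The seq2seq model would then be trained to reconstruct `src` from `noisy`.
```

The denoising loss is simply the usual cross-entropy of the clean sentence given its noisy version, so no parallel data is needed; the paraphrasing objective is trained jointly on the same monolingual corpus.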
Anthology ID:
2021.vardial-1.6
Volume:
Proceedings of the Eighth Workshop on NLP for Similar Languages, Varieties and Dialects
Month:
April
Year:
2021
Address:
Kiyv, Ukraine
Venues:
EACL | VarDial
Publisher:
Association for Computational Linguistics
Pages:
49–59
URL:
https://aclanthology.org/2021.vardial-1.6
Cite (ACL):
Rami Aly, Andrew Caines, and Paula Buttery. 2021. Efficient Unsupervised NMT for Related Languages with Cross-Lingual Language Models and Fidelity Objectives. In Proceedings of the Eighth Workshop on NLP for Similar Languages, Varieties and Dialects, pages 49–59, Kiyv, Ukraine. Association for Computational Linguistics.
Cite (Informal):
Efficient Unsupervised NMT for Related Languages with Cross-Lingual Language Models and Fidelity Objectives (Aly et al., VarDial 2021)
PDF:
https://aclanthology.org/2021.vardial-1.6.pdf
Data
WMT 2015