The LMU Munich System for the WMT 2020 Unsupervised Machine Translation Shared Task

Alexandra Chronopoulou, Dario Stojanovski, Viktor Hangya, Alexander Fraser


Abstract
This paper describes the submission of LMU Munich to the WMT 2020 unsupervised machine translation shared task in two language directions, German↔Upper Sorbian. Our core unsupervised neural machine translation (UNMT) system follows the strategy of Chronopoulou et al. (2020): a language generation model pretrained on German monolingual data is fine-tuned on both German and Upper Sorbian and then used to initialize a UNMT model, which is trained with online backtranslation. Pseudo-parallel data obtained from an unsupervised statistical machine translation (USMT) system is used to fine-tune the UNMT model. We also apply BPE-Dropout to the low-resource (Upper Sorbian) data to obtain a more robust system. We additionally experiment with residual adapters and find them useful in the Upper Sorbian→German direction. We explore sampling during backtranslation and curriculum learning to use the SMT translations in a more principled way. Finally, we ensemble our best-performing systems and reach a BLEU score of 32.4 on German→Upper Sorbian and 35.2 on Upper Sorbian→German.
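The abstract mentions applying BPE-Dropout to the Upper Sorbian data; the sketch below illustrates the underlying idea of stochastic subword segmentation, where eligible BPE merges are randomly skipped so the same word receives different segmentations across training epochs. This is a minimal, self-contained toy under stated assumptions, not the authors' implementation: the merge table, dropout rate, and function name are illustrative only.

import random

def bpe_dropout_segment(word, merges, dropout=0.1, rng=random):
    """Segment `word` into subwords, randomly skipping eligible merges (toy BPE-Dropout)."""
    symbols = list(word) + ["</w>"]                     # start from characters plus end-of-word marker
    merge_rank = {pair: i for i, pair in enumerate(merges)}
    while True:
        # collect adjacent pairs that appear in the merge table and survive dropout
        candidates = [
            (merge_rank[(a, b)], i)
            for i, (a, b) in enumerate(zip(symbols, symbols[1:]))
            if (a, b) in merge_rank and rng.random() >= dropout
        ]
        if not candidates:
            break
        _, i = min(candidates)                          # apply the highest-priority surviving merge
        symbols[i:i + 2] = [symbols[i] + symbols[i + 1]]
    return symbols

# Toy merge table (an assumption, not learned from the task data).
merges = [("l", "o"), ("lo", "w"), ("e", "r"), ("er", "</w>"), ("low", "er</w>")]
print(bpe_dropout_segment("lower", merges, dropout=0.0))   # deterministic BPE: ['lower</w>']
print(bpe_dropout_segment("lower", merges, dropout=0.3))   # stochastic variant, e.g. ['low', 'er</w>']

With dropout set to 0 the procedure reduces to ordinary BPE; with a positive rate the segmentation becomes stochastic, which is the regularization effect the system exploits for the low-resource Upper Sorbian side.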
Anthology ID:
2020.wmt-1.128
Volume:
Proceedings of the Fifth Conference on Machine Translation
Month:
November
Year:
2020
Address:
Online
Editors:
Loïc Barrault, Ondřej Bojar, Fethi Bougares, Rajen Chatterjee, Marta R. Costa-jussà, Christian Federmann, Mark Fishel, Alexander Fraser, Yvette Graham, Paco Guzman, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Philipp Koehn, André Martins, Makoto Morishita, Christof Monz, Masaaki Nagata, Toshiaki Nakazawa, Matteo Negri
Venue:
WMT
SIG:
SIGMT
Publisher:
Association for Computational Linguistics
Pages:
1084–1091
URL:
https://aclanthology.org/2020.wmt-1.128
Cite (ACL):
Alexandra Chronopoulou, Dario Stojanovski, Viktor Hangya, and Alexander Fraser. 2020. The LMU Munich System for the WMT 2020 Unsupervised Machine Translation Shared Task. In Proceedings of the Fifth Conference on Machine Translation, pages 1084–1091, Online. Association for Computational Linguistics.
Cite (Informal):
The LMU Munich System for the WMT 2020 Unsupervised Machine Translation Shared Task (Chronopoulou et al., WMT 2020)
PDF:
https://aclanthology.org/2020.wmt-1.128.pdf
Video:
https://slideslive.com/38939582
Code:
alexandra-chron/umt-lmu-wmt2020