The LMU Munich System for the WMT20 Very Low Resource Supervised MT Task

Jindřich Libovický, Viktor Hangya, Helmut Schmid, Alexander Fraser


Abstract
We present our systems for the WMT20 Very Low Resource MT Task for translation between German and Upper Sorbian. To train our systems, we generate synthetic data by both back- and forward-translation. Additionally, we enrich the training data with German-Czech parallel data whose Czech side is translated into Upper Sorbian by an unsupervised statistical MT system that incorporates orthographically similar word pairs and transliterations of out-of-vocabulary (OOV) words. Our best translation system between German and Upper Sorbian is based on transfer learning from a Czech-German system and scores 12 to 13 BLEU points higher than a baseline system built using only the available parallel data.
Anthology ID:
2020.wmt-1.131
Volume:
Proceedings of the Fifth Conference on Machine Translation
Month:
November
Year:
2020
Address:
Online
Editors:
Loïc Barrault, Ondřej Bojar, Fethi Bougares, Rajen Chatterjee, Marta R. Costa-jussà, Christian Federmann, Mark Fishel, Alexander Fraser, Yvette Graham, Paco Guzman, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Philipp Koehn, André Martins, Makoto Morishita, Christof Monz, Masaaki Nagata, Toshiaki Nakazawa, Matteo Negri
Venue:
WMT
SIG:
SIGMT
Publisher:
Association for Computational Linguistics
Pages:
1104–1111
URL:
https://aclanthology.org/2020.wmt-1.131
Cite (ACL):
Jindřich Libovický, Viktor Hangya, Helmut Schmid, and Alexander Fraser. 2020. The LMU Munich System for the WMT20 Very Low Resource Supervised MT Task. In Proceedings of the Fifth Conference on Machine Translation, pages 1104–1111, Online. Association for Computational Linguistics.
Cite (Informal):
The LMU Munich System for the WMT20 Very Low Resource Supervised MT Task (Libovický et al., WMT 2020)
PDF:
https://aclanthology.org/2020.wmt-1.131.pdf
Video:
https://slideslive.com/38939564