Findings of the WMT 2020 Shared Tasks in Unsupervised MT and Very Low Resource Supervised MT

Alexander Fraser


Abstract
We describe the WMT 2020 Shared Tasks in Unsupervised MT and Very Low Resource Supervised MT. In both tasks, the community studied German to Upper Sorbian and Upper Sorbian to German MT, a realistic machine translation scenario, unlike the simulated scenarios used in much of the past work on unsupervised MT. We were able to obtain most of the digital data available for Upper Sorbian, a minority language of Germany; this was the original motivation for the Unsupervised MT shared task. As we were defining the task, we also obtained a small amount of parallel data (about 60,000 parallel sentences), allowing us to offer a Very Low Resource Supervised MT task as well. Six primary systems participated in the unsupervised shared task; two of these systems used additional data beyond the data released by the organizers. Ten primary systems participated in the very low resource supervised task. The paper discusses the background, presents the tasks and results, and discusses best practices for the future.
Anthology ID:
2020.wmt-1.80
Volume:
Proceedings of the Fifth Conference on Machine Translation
Month:
November
Year:
2020
Address:
Online
Venue:
WMT
SIG:
SIGMT
Publisher:
Association for Computational Linguistics
Pages:
765–771
URL:
https://aclanthology.org/2020.wmt-1.80
Cite (ACL):
Alexander Fraser. 2020. Findings of the WMT 2020 Shared Tasks in Unsupervised MT and Very Low Resource Supervised MT. In Proceedings of the Fifth Conference on Machine Translation, pages 765–771, Online. Association for Computational Linguistics.
Cite (Informal):
Findings of the WMT 2020 Shared Tasks in Unsupervised MT and Very Low Resource Supervised MT (Fraser, WMT 2020)
PDF:
https://aclanthology.org/2020.wmt-1.80.pdf
Video:
https://slideslive.com/38939667