Tilde at WMT 2020: News Task Systems

Rihards Krišlauks, Mārcis Pinnis


Abstract
This paper describes Tilde’s submission to the WMT 2020 shared task on news translation for both directions of the English-Polish language pair, in both the constrained and the unconstrained tracks. Following our submissions from previous years, we build our baseline systems as morphologically motivated sub-word unit-based Transformer base models, which we train using the Marian machine translation toolkit. Additionally, we experiment with different parallel and monolingual data selection schemes, as well as sampled back-translation. Our final models are ensembles of Transformer base and Transformer big models that feature right-to-left re-ranking.
Anthology ID:
2020.wmt-1.15
Volume:
Proceedings of the Fifth Conference on Machine Translation
Month:
November
Year:
2020
Address:
Online
Venues:
EMNLP | WMT
SIG:
SIGMT
Publisher:
Association for Computational Linguistics
Pages:
175–180
URL:
https://aclanthology.org/2020.wmt-1.15
Cite (ACL):
Rihards Krišlauks and Mārcis Pinnis. 2020. Tilde at WMT 2020: News Task Systems. In Proceedings of the Fifth Conference on Machine Translation, pages 175–180, Online. Association for Computational Linguistics.
Cite (Informal):
Tilde at WMT 2020: News Task Systems (Krišlauks & Pinnis, WMT 2020)
PDF:
https://aclanthology.org/2020.wmt-1.15.pdf
Video:
https://slideslive.com/38939633
Data
ParaCrawl