Tilde at WMT 2020: News Task Systems

Rihards Krišlauks, Mārcis Pinnis


Abstract
This paper describes Tilde’s submission to the WMT 2020 shared task on news translation for both directions of the English-Polish language pair in both the constrained and the unconstrained tracks. As in our submissions from previous years, we build our baseline systems as morphologically motivated sub-word unit-based Transformer base models, which we train using the Marian machine translation toolkit. Additionally, we experiment with different parallel and monolingual data selection schemes, as well as sampled back-translation. Our final models are ensembles of Transformer base and Transformer big models which feature right-to-left re-ranking.
Anthology ID:
2020.wmt-1.15
Volume:
Proceedings of the Fifth Conference on Machine Translation
Month:
November
Year:
2020
Address:
Online
Editors:
Loïc Barrault, Ondřej Bojar, Fethi Bougares, Rajen Chatterjee, Marta R. Costa-jussà, Christian Federmann, Mark Fishel, Alexander Fraser, Yvette Graham, Paco Guzman, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Philipp Koehn, André Martins, Makoto Morishita, Christof Monz, Masaaki Nagata, Toshiaki Nakazawa, Matteo Negri
Venue:
WMT
SIG:
SIGMT
Publisher:
Association for Computational Linguistics
Pages:
175–180
URL:
https://aclanthology.org/2020.wmt-1.15
Cite (ACL):
Rihards Krišlauks and Mārcis Pinnis. 2020. Tilde at WMT 2020: News Task Systems. In Proceedings of the Fifth Conference on Machine Translation, pages 175–180, Online. Association for Computational Linguistics.
Cite (Informal):
Tilde at WMT 2020: News Task Systems (Krišlauks & Pinnis, WMT 2020)
PDF:
https://aclanthology.org/2020.wmt-1.15.pdf
Video:
https://slideslive.com/38939633
Data
ParaCrawl