NRC-CNRC Systems for Upper Sorbian-German and Lower Sorbian-German Machine Translation 2021

Rebecca Knowles, Samuel Larkin


Abstract
We describe our neural machine translation systems for the 2021 shared task on Unsupervised and Very Low Resource Supervised MT, translating between Upper Sorbian and German (low-resource) and between Lower Sorbian and German (unsupervised). The systems incorporated data filtering, backtranslation, BPE-dropout, ensembling, and transfer learning from high(er)-resource languages. As measured by automatic metrics, our systems showed strong performance, consistently placing first or tied for first across most metrics and translation directions.
Anthology ID:
2021.wmt-1.107
Volume:
Proceedings of the Sixth Conference on Machine Translation
Month:
November
Year:
2021
Address:
Online
Editors:
Loïc Barrault, Ondřej Bojar, Fethi Bougares, Rajen Chatterjee, Marta R. Costa-jussà, Christian Federmann, Mark Fishel, Alexander Fraser, Markus Freitag, Yvette Graham, Roman Grundkiewicz, Paco Guzman, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Philipp Koehn, Tom Kocmi, André Martins, Makoto Morishita, Christof Monz
Venue:
WMT
SIG:
SIGMT
Publisher:
Association for Computational Linguistics
Pages:
999–1008
URL:
https://aclanthology.org/2021.wmt-1.107
Cite (ACL):
Rebecca Knowles and Samuel Larkin. 2021. NRC-CNRC Systems for Upper Sorbian-German and Lower Sorbian-German Machine Translation 2021. In Proceedings of the Sixth Conference on Machine Translation, pages 999–1008, Online. Association for Computational Linguistics.
Cite (Informal):
NRC-CNRC Systems for Upper Sorbian-German and Lower Sorbian-German Machine Translation 2021 (Knowles & Larkin, WMT 2021)
PDF:
https://aclanthology.org/2021.wmt-1.107.pdf