University of Cape Town’s WMT22 System: Multilingual Machine Translation for Southern African Languages

Khalid Elmadani, Francois Meyer, Jan Buys


Abstract
This paper describes the University of Cape Town’s submission to the constrained track of the WMT22 Shared Task: Large-Scale Machine Translation Evaluation for African Languages. Our system is a single multilingual translation model that translates between English and eight South and South East African languages, as well as between specific pairs of the African languages. We used several techniques suited to low-resource machine translation (MT), including overlap BPE, back-translation, synthetic training data generation, and adding more translation directions during training. Our results show the value of these techniques, especially for directions where very little or no bilingual training data is available.
Anthology ID:
2022.wmt-1.101
Volume:
Proceedings of the Seventh Conference on Machine Translation (WMT)
Month:
December
Year:
2022
Address:
Abu Dhabi, United Arab Emirates (Hybrid)
Editors:
Philipp Koehn, Loïc Barrault, Ondřej Bojar, Fethi Bougares, Rajen Chatterjee, Marta R. Costa-jussà, Christian Federmann, Mark Fishel, Alexander Fraser, Markus Freitag, Yvette Graham, Roman Grundkiewicz, Paco Guzman, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Tom Kocmi, André Martins, Makoto Morishita, Christof Monz, Masaaki Nagata, Toshiaki Nakazawa, Matteo Negri, Aurélie Névéol, Mariana Neves, Martin Popel, Marco Turchi, Marcos Zampieri
Venue:
WMT
SIG:
SIGMT
Publisher:
Association for Computational Linguistics
Pages:
1039–1048
URL:
https://aclanthology.org/2022.wmt-1.101
Cite (ACL):
Khalid Elmadani, Francois Meyer, and Jan Buys. 2022. University of Cape Town’s WMT22 System: Multilingual Machine Translation for Southern African Languages. In Proceedings of the Seventh Conference on Machine Translation (WMT), pages 1039–1048, Abu Dhabi, United Arab Emirates (Hybrid). Association for Computational Linguistics.
Cite (Informal):
University of Cape Town’s WMT22 System: Multilingual Machine Translation for Southern African Languages (Elmadani et al., WMT 2022)
PDF:
https://aclanthology.org/2022.wmt-1.101.pdf