Machine Translation of Low-Resource Indo-European Languages

Wei-Rui Chen, Muhammad Abdul-Mageed


Abstract
In this work, we investigate methods for the challenging task of translating between low-resource language pairs that exhibit some level of similarity. In particular, we consider the utility of transfer learning for translating between several Indo-European low-resource languages from the Germanic and Romance language families. We build two main classes of transfer-based systems to study how relatedness can benefit translation performance. The primary system fine-tunes a model pre-trained on a related language pair, and the contrastive system fine-tunes one pre-trained on an unrelated language pair. Our experiments show that although relatedness is not necessary for transfer learning to work, it does benefit model performance.
Anthology ID:
2021.wmt-1.41
Volume:
Proceedings of the Sixth Conference on Machine Translation
Month:
November
Year:
2021
Address:
Online
Editors:
Loic Barrault, Ondrej Bojar, Fethi Bougares, Rajen Chatterjee, Marta R. Costa-jussa, Christian Federmann, Mark Fishel, Alexander Fraser, Markus Freitag, Yvette Graham, Roman Grundkiewicz, Paco Guzman, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Philipp Koehn, Tom Kocmi, Andre Martins, Makoto Morishita, Christof Monz
Venue:
WMT
SIG:
SIGMT
Publisher:
Association for Computational Linguistics
Pages:
347–353
URL:
https://aclanthology.org/2021.wmt-1.41
Cite (ACL):
Wei-Rui Chen and Muhammad Abdul-Mageed. 2021. Machine Translation of Low-Resource Indo-European Languages. In Proceedings of the Sixth Conference on Machine Translation, pages 347–353, Online. Association for Computational Linguistics.
Cite (Informal):
Machine Translation of Low-Resource Indo-European Languages (Chen & Abdul-Mageed, WMT 2021)
PDF:
https://aclanthology.org/2021.wmt-1.41.pdf