The ADAPT System Description for the IWSLT 2018 Basque to English Translation Task

Alberto Poncelas, Andy Way, Kepa Sarasola


Abstract
In this paper we present the ADAPT system built for the Basque to English Low Resource MT Evaluation Campaign. Basque is a low-resource, morphologically rich language. This poses a challenge for Neural Machine Translation models, which usually achieve better performance when trained on large data sets. Accordingly, we used synthetic data to improve on the translation quality produced by a model built using only authentic data. Our proposal uses back-translated data to: (a) create new sentences, so the system can be trained with more data; and (b) translate sentences that are close to the test set, so the model can be fine-tuned to the document to be translated.
Anthology ID:
2018.iwslt-1.11
Volume:
Proceedings of the 15th International Conference on Spoken Language Translation
Month:
October 29-30
Year:
2018
Address:
Brussels
Editors:
Marco Turchi, Jan Niehues, Marcello Federico
Venue:
IWSLT
SIG:
SIGSLT
Publisher:
International Conference on Spoken Language Translation
Pages:
76–82
URL:
https://aclanthology.org/2018.iwslt-1.11
Cite (ACL):
Alberto Poncelas, Andy Way, and Kepa Sarasola. 2018. The ADAPT System Description for the IWSLT 2018 Basque to English Translation Task. In Proceedings of the 15th International Conference on Spoken Language Translation, pages 76–82, Brussels. International Conference on Spoken Language Translation.
Cite (Informal):
The ADAPT System Description for the IWSLT 2018 Basque to English Translation Task (Poncelas et al., IWSLT 2018)
PDF:
https://aclanthology.org/2018.iwslt-1.11.pdf
Data
WMT 2016