Naver Labs Europe’s Systems for the WMT19 Machine Translation Robustness Task

Alexandre Bérard; Ioan Calapodescu; Claude Roux

doi:10.18653/v1/W19-5361

Abstract

This paper describes the systems that we submitted to the WMT19 Machine Translation robustness task. This task aims to improve MT’s robustness to noise found on social media, such as informal language, spelling mistakes and other orthographic variations. The organizers provide parallel data extracted from a social media website in two language pairs, French-English and Japanese-English, with one set for each translation direction. The goal is to obtain the best scores on unseen test sets from the same source, according to automatic metrics (BLEU) and human evaluation. We propose one single-model system and one ensemble system for each translation direction. Our ensemble models ranked first in all language pairs according to BLEU evaluation. We discuss the pre-processing choices that we made, and present our solutions for robustness to noise and domain adaptation.

Anthology ID: W19-5361
Volume: Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1)
Month: August
Year: 2019
Address: Florence, Italy
Editors: Ondřej Bojar, Rajen Chatterjee, Christian Federmann, Mark Fishel, Yvette Graham, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Philipp Koehn, André Martins, Christof Monz, Matteo Negri, Aurélie Névéol, Mariana Neves, Matt Post, Marco Turchi, Karin Verspoor
Venue: WMT
SIG: SIGMT
Publisher: Association for Computational Linguistics
Pages: 526–532
URL: https://aclanthology.org/W19-5361/
DOI: 10.18653/v1/W19-5361
Cite (ACL): Alexandre Berard, Ioan Calapodescu, and Claude Roux. 2019. Naver Labs Europe’s Systems for the WMT19 Machine Translation Robustness Task. In Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1), pages 526–532, Florence, Italy. Association for Computational Linguistics.
Cite (Informal): Naver Labs Europe’s Systems for the WMT19 Machine Translation Robustness Task (Berard et al., WMT 2019)
PDF: https://aclanthology.org/W19-5361.pdf
