NICT’s Supervised Neural Machine Translation Systems for the WMT19 News Translation Task
Raj Dabre, Kehai Chen, Benjamin Marie, Rui Wang, Atsushi Fujita, Masao Utiyama, Eiichiro Sumita
Abstract
In this paper, we describe our supervised neural machine translation (NMT) systems that we developed for the news translation task for Kazakh↔English, Gujarati↔English, Chinese↔English, and English→Finnish translation directions. We focused on leveraging multilingual transfer learning and back-translation for the extremely low-resource language pairs: Kazakh↔English and Gujarati↔English translation. For the Chinese↔English translation, we used the provided parallel data augmented with a large quantity of back-translated monolingual data to train state-of-the-art NMT systems. We then employed techniques that have been proven to be most effective, such as back-translation, fine-tuning, and model ensembling, to generate the primary submissions of Chinese↔English. For English→Finnish, our submission from WMT18 remains a strong baseline despite the increase in parallel corpora for this year’s task.
- Anthology ID:
- W19-5313
- Volume:
- Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1)
- Month:
- August
- Year:
- 2019
- Address:
- Florence, Italy
- Editors:
- Ondřej Bojar, Rajen Chatterjee, Christian Federmann, Mark Fishel, Yvette Graham, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Philipp Koehn, André Martins, Christof Monz, Matteo Negri, Aurélie Névéol, Mariana Neves, Matt Post, Marco Turchi, Karin Verspoor
- Venue:
- WMT
- SIG:
- SIGMT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 168–174
- URL:
- https://aclanthology.org/W19-5313/
- DOI:
- 10.18653/v1/W19-5313
- Bibkey:
- dabre-etal-2019-nicts
- Cite (ACL):
- Raj Dabre, Kehai Chen, Benjamin Marie, Rui Wang, Atsushi Fujita, Masao Utiyama, and Eiichiro Sumita. 2019. NICT’s Supervised Neural Machine Translation Systems for the WMT19 News Translation Task. In Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1), pages 168–174, Florence, Italy. Association for Computational Linguistics.
- Cite (Informal):
- NICT’s Supervised Neural Machine Translation Systems for the WMT19 News Translation Task (Dabre et al., WMT 2019)
- PDF:
- https://aclanthology.org/W19-5313.pdf
Export citation
@inproceedings{dabre-etal-2019-nicts,
    title = "{NICT}'s Supervised Neural Machine Translation Systems for the {WMT}19 News Translation Task",
    author = "Dabre, Raj and Chen, Kehai and Marie, Benjamin and Wang, Rui and Fujita, Atsushi and Utiyama, Masao and Sumita, Eiichiro",
    editor = "Bojar, Ond{\v{r}}ej and Chatterjee, Rajen and Federmann, Christian and Fishel, Mark and Graham, Yvette and Haddow, Barry and Huck, Matthias and Yepes, Antonio Jimeno and Koehn, Philipp and Martins, Andr{\'e} and Monz, Christof and Negri, Matteo and N{\'e}v{\'e}ol, Aur{\'e}lie and Neves, Mariana and Post, Matt and Turchi, Marco and Verspoor, Karin",
    booktitle = "Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1)",
    month = aug,
    year = "2019",
    address = "Florence, Italy",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/W19-5313/",
    doi = "10.18653/v1/W19-5313",
    pages = "168--174",
    abstract = "In this paper, we describe our supervised neural machine translation (NMT) systems that we developed for the news translation task for Kazakh{\ensuremath{\leftrightarrow}}English, Gujarati{\ensuremath{\leftrightarrow}}English, Chinese{\ensuremath{\leftrightarrow}}English, and English{\textrightarrow}Finnish translation directions. We focused on leveraging multilingual transfer learning and back-translation for the extremely low-resource language pairs: Kazakh{\ensuremath{\leftrightarrow}}English and Gujarati{\ensuremath{\leftrightarrow}}English translation. For the Chinese{\ensuremath{\leftrightarrow}}English translation, we used the provided parallel data augmented with a large quantity of back-translated monolingual data to train state-of-the-art NMT systems. We then employed techniques that have been proven to be most effective, such as back-translation, fine-tuning, and model ensembling, to generate the primary submissions of Chinese{\ensuremath{\leftrightarrow}}English. For English{\textrightarrow}Finnish, our submission from WMT18 remains a strong baseline despite the increase in parallel corpora for this year's task."
}
<?xml version="1.0" encoding="UTF-8"?> <modsCollection xmlns="http://www.loc.gov/mods/v3"> <mods ID="dabre-etal-2019-nicts"> <titleInfo> <title>NICT’s Supervised Neural Machine Translation Systems for the WMT19 News Translation Task</title> </titleInfo> <name type="personal"> <namePart type="given">Raj</namePart> <namePart type="family">Dabre</namePart> <role> <roleTerm authority="marcrelator" type="text">author</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Kehai</namePart> <namePart type="family">Chen</namePart> <role> <roleTerm authority="marcrelator" type="text">author</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Benjamin</namePart> <namePart type="family">Marie</namePart> <role> <roleTerm authority="marcrelator" type="text">author</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Rui</namePart> <namePart type="family">Wang</namePart> <role> <roleTerm authority="marcrelator" type="text">author</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Atsushi</namePart> <namePart type="family">Fujita</namePart> <role> <roleTerm authority="marcrelator" type="text">author</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Masao</namePart> <namePart type="family">Utiyama</namePart> <role> <roleTerm authority="marcrelator" type="text">author</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Eiichiro</namePart> <namePart type="family">Sumita</namePart> <role> <roleTerm authority="marcrelator" type="text">author</roleTerm> </role> </name> <originInfo> <dateIssued>2019-08</dateIssued> </originInfo> <typeOfResource>text</typeOfResource> <relatedItem type="host"> <titleInfo> <title>Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1)</title> </titleInfo> <name type="personal"> <namePart type="given">Ondřej</namePart> <namePart type="family">Bojar</namePart> <role> <roleTerm 
authority="marcrelator" type="text">editor</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Rajen</namePart> <namePart type="family">Chatterjee</namePart> <role> <roleTerm authority="marcrelator" type="text">editor</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Christian</namePart> <namePart type="family">Federmann</namePart> <role> <roleTerm authority="marcrelator" type="text">editor</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Mark</namePart> <namePart type="family">Fishel</namePart> <role> <roleTerm authority="marcrelator" type="text">editor</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Yvette</namePart> <namePart type="family">Graham</namePart> <role> <roleTerm authority="marcrelator" type="text">editor</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Barry</namePart> <namePart type="family">Haddow</namePart> <role> <roleTerm authority="marcrelator" type="text">editor</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Matthias</namePart> <namePart type="family">Huck</namePart> <role> <roleTerm authority="marcrelator" type="text">editor</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Antonio</namePart> <namePart type="given">Jimeno</namePart> <namePart type="family">Yepes</namePart> <role> <roleTerm authority="marcrelator" type="text">editor</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Philipp</namePart> <namePart type="family">Koehn</namePart> <role> <roleTerm authority="marcrelator" type="text">editor</roleTerm> </role> </name> <name type="personal"> <namePart type="given">André</namePart> <namePart type="family">Martins</namePart> <role> <roleTerm authority="marcrelator" type="text">editor</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Christof</namePart> <namePart type="family">Monz</namePart> <role> <roleTerm 
authority="marcrelator" type="text">editor</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Matteo</namePart> <namePart type="family">Negri</namePart> <role> <roleTerm authority="marcrelator" type="text">editor</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Aurélie</namePart> <namePart type="family">Névéol</namePart> <role> <roleTerm authority="marcrelator" type="text">editor</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Mariana</namePart> <namePart type="family">Neves</namePart> <role> <roleTerm authority="marcrelator" type="text">editor</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Matt</namePart> <namePart type="family">Post</namePart> <role> <roleTerm authority="marcrelator" type="text">editor</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Marco</namePart> <namePart type="family">Turchi</namePart> <role> <roleTerm authority="marcrelator" type="text">editor</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Karin</namePart> <namePart type="family">Verspoor</namePart> <role> <roleTerm authority="marcrelator" type="text">editor</roleTerm> </role> </name> <originInfo> <publisher>Association for Computational Linguistics</publisher> <place> <placeTerm type="text">Florence, Italy</placeTerm> </place> </originInfo> <genre authority="marcgt">conference publication</genre> </relatedItem> <abstract>In this paper, we describe our supervised neural machine translation (NMT) systems that we developed for the news translation task for Kazakh↔English, Gujarati↔English, Chinese↔English, and English→Finnish translation directions. We focused on leveraging multilingual transfer learning and back-translation for the extremely low-resource language pairs: Kazakh↔English and Gujarati↔English translation. 
For the Chinese↔English translation, we used the provided parallel data augmented with a large quantity of back-translated monolingual data to train state-of-the-art NMT systems. We then employed techniques that have been proven to be most effective, such as back-translation, fine-tuning, and model ensembling, to generate the primary submissions of Chinese↔English. For English→Finnish, our submission from WMT18 remains a strong baseline despite the increase in parallel corpora for this year’s task.</abstract> <identifier type="citekey">dabre-etal-2019-nicts</identifier> <identifier type="doi">10.18653/v1/W19-5313</identifier> <location> <url>https://aclanthology.org/W19-5313/</url> </location> <part> <date>2019-08</date> <extent unit="page"> <start>168</start> <end>174</end> </extent> </part> </mods> </modsCollection>
%0 Conference Proceedings
%T NICT’s Supervised Neural Machine Translation Systems for the WMT19 News Translation Task
%A Dabre, Raj
%A Chen, Kehai
%A Marie, Benjamin
%A Wang, Rui
%A Fujita, Atsushi
%A Utiyama, Masao
%A Sumita, Eiichiro
%Y Bojar, Ondřej
%Y Chatterjee, Rajen
%Y Federmann, Christian
%Y Fishel, Mark
%Y Graham, Yvette
%Y Haddow, Barry
%Y Huck, Matthias
%Y Yepes, Antonio Jimeno
%Y Koehn, Philipp
%Y Martins, André
%Y Monz, Christof
%Y Negri, Matteo
%Y Névéol, Aurélie
%Y Neves, Mariana
%Y Post, Matt
%Y Turchi, Marco
%Y Verspoor, Karin
%S Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1)
%D 2019
%8 August
%I Association for Computational Linguistics
%C Florence, Italy
%F dabre-etal-2019-nicts
%X In this paper, we describe our supervised neural machine translation (NMT) systems that we developed for the news translation task for Kazakh↔English, Gujarati↔English, Chinese↔English, and English→Finnish translation directions. We focused on leveraging multilingual transfer learning and back-translation for the extremely low-resource language pairs: Kazakh↔English and Gujarati↔English translation. For the Chinese↔English translation, we used the provided parallel data augmented with a large quantity of back-translated monolingual data to train state-of-the-art NMT systems. We then employed techniques that have been proven to be most effective, such as back-translation, fine-tuning, and model ensembling, to generate the primary submissions of Chinese↔English. For English→Finnish, our submission from WMT18 remains a strong baseline despite the increase in parallel corpora for this year’s task.
%R 10.18653/v1/W19-5313
%U https://aclanthology.org/W19-5313/
%U https://doi.org/10.18653/v1/W19-5313
%P 168-174