The TALP-UPC Machine Translation Systems for WMT19 News Translation Task: Pivoting Techniques for Low Resource MT
Noe Casas, José A. R. Fonollosa, Carlos Escolano, Christine Basta, Marta R. Costa-jussà
Correct Metadata for
Abstract
In this article, we describe the TALP-UPC research group participation in the WMT19 news translation shared task for Kazakh-English. Given the low amount of parallel training data, we resort to using Russian as pivot language, training subword-based statistical translation systems for Russian-Kazakh and Russian-English that were then used to create two synthetic pseudo-parallel corpora for Kazakh-English and English-Kazakh respectively. Finally, a self-attention model based on the decoder part of the Transformer architecture was trained on the two pseudo-parallel corpora.- Anthology ID:
- W19-5311
- Volume:
- Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1)
- Month:
- August
- Year:
- 2019
- Address:
- Florence, Italy
- Editors:
- Ondřej Bojar, Rajen Chatterjee, Christian Federmann, Mark Fishel, Yvette Graham, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Philipp Koehn, André Martins, Christof Monz, Matteo Negri, Aurélie Névéol, Mariana Neves, Matt Post, Marco Turchi, Karin Verspoor
- Venue:
- WMT
- SIG:
- SIGMT
- Publisher:
- Association for Computational Linguistics
- Note:
- Pages:
- 155–162
- Language:
- URL:
- https://aclanthology.org/W19-5311/
- DOI:
- 10.18653/v1/W19-5311
- Bibkey:
- Cite (ACL):
- Noe Casas, José A. R. Fonollosa, Carlos Escolano, Christine Basta, and Marta R. Costa-jussà. 2019. The TALP-UPC Machine Translation Systems for WMT19 News Translation Task: Pivoting Techniques for Low Resource MT. In Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1), pages 155–162, Florence, Italy. Association for Computational Linguistics.
- Cite (Informal):
- The TALP-UPC Machine Translation Systems for WMT19 News Translation Task: Pivoting Techniques for Low Resource MT (Casas et al., WMT 2019)
- Copy Citation:
- PDF:
- https://aclanthology.org/W19-5311.pdf
- Poster:
- W19-5311.Poster.pdf
Export citation
@inproceedings{casas-etal-2019-talp, title = "The {TALP}-{UPC} Machine Translation Systems for {WMT}19 News Translation Task: Pivoting Techniques for Low Resource {MT}", author = "Casas, Noe and Fonollosa, Jos{\'e} A. R. and Escolano, Carlos and Basta, Christine and Costa-juss{\`a}, Marta R.", editor = "Bojar, Ond{\v{r}}ej and Chatterjee, Rajen and Federmann, Christian and Fishel, Mark and Graham, Yvette and Haddow, Barry and Huck, Matthias and Yepes, Antonio Jimeno and Koehn, Philipp and Martins, Andr{\'e} and Monz, Christof and Negri, Matteo and N{\'e}v{\'e}ol, Aur{\'e}lie and Neves, Mariana and Post, Matt and Turchi, Marco and Verspoor, Karin", booktitle = "Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1)", month = aug, year = "2019", address = "Florence, Italy", publisher = "Association for Computational Linguistics", url = "https://aclanthology.org/W19-5311/", doi = "10.18653/v1/W19-5311", pages = "155--162", abstract = "In this article, we describe the TALP-UPC research group participation in the WMT19 news translation shared task for Kazakh-English. Given the low amount of parallel training data, we resort to using Russian as pivot language, training subword-based statistical translation systems for Russian-Kazakh and Russian-English that were then used to create two synthetic pseudo-parallel corpora for Kazakh-English and English-Kazakh respectively. Finally, a self-attention model based on the decoder part of the Transformer architecture was trained on the two pseudo-parallel corpora." }
<?xml version="1.0" encoding="UTF-8"?> <modsCollection xmlns="http://www.loc.gov/mods/v3"> <mods ID="casas-etal-2019-talp"> <titleInfo> <title>The TALP-UPC Machine Translation Systems for WMT19 News Translation Task: Pivoting Techniques for Low Resource MT</title> </titleInfo> <name type="personal"> <namePart type="given">Noe</namePart> <namePart type="family">Casas</namePart> <role> <roleTerm authority="marcrelator" type="text">author</roleTerm> </role> </name> <name type="personal"> <namePart type="given">José</namePart> <namePart type="given">A</namePart> <namePart type="given">R</namePart> <namePart type="family">Fonollosa</namePart> <role> <roleTerm authority="marcrelator" type="text">author</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Carlos</namePart> <namePart type="family">Escolano</namePart> <role> <roleTerm authority="marcrelator" type="text">author</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Christine</namePart> <namePart type="family">Basta</namePart> <role> <roleTerm authority="marcrelator" type="text">author</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Marta</namePart> <namePart type="given">R</namePart> <namePart type="family">Costa-jussà</namePart> <role> <roleTerm authority="marcrelator" type="text">author</roleTerm> </role> </name> <originInfo> <dateIssued>2019-08</dateIssued> </originInfo> <typeOfResource>text</typeOfResource> <relatedItem type="host"> <titleInfo> <title>Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1)</title> </titleInfo> <name type="personal"> <namePart type="given">Ondřej</namePart> <namePart type="family">Bojar</namePart> <role> <roleTerm authority="marcrelator" type="text">editor</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Rajen</namePart> <namePart type="family">Chatterjee</namePart> <role> <roleTerm authority="marcrelator" type="text">editor</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Christian</namePart> <namePart type="family">Federmann</namePart> <role> <roleTerm authority="marcrelator" type="text">editor</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Mark</namePart> <namePart type="family">Fishel</namePart> <role> <roleTerm authority="marcrelator" type="text">editor</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Yvette</namePart> <namePart type="family">Graham</namePart> <role> <roleTerm authority="marcrelator" type="text">editor</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Barry</namePart> <namePart type="family">Haddow</namePart> <role> <roleTerm authority="marcrelator" type="text">editor</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Matthias</namePart> <namePart type="family">Huck</namePart> <role> <roleTerm authority="marcrelator" type="text">editor</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Antonio</namePart> <namePart type="given">Jimeno</namePart> <namePart type="family">Yepes</namePart> <role> <roleTerm authority="marcrelator" type="text">editor</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Philipp</namePart> <namePart type="family">Koehn</namePart> <role> <roleTerm authority="marcrelator" type="text">editor</roleTerm> </role> </name> <name type="personal"> <namePart type="given">André</namePart> <namePart type="family">Martins</namePart> <role> <roleTerm authority="marcrelator" type="text">editor</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Christof</namePart> <namePart type="family">Monz</namePart> <role> <roleTerm authority="marcrelator" type="text">editor</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Matteo</namePart> <namePart type="family">Negri</namePart> <role> <roleTerm authority="marcrelator" type="text">editor</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Aurélie</namePart> <namePart type="family">Névéol</namePart> <role> <roleTerm authority="marcrelator" type="text">editor</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Mariana</namePart> <namePart type="family">Neves</namePart> <role> <roleTerm authority="marcrelator" type="text">editor</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Matt</namePart> <namePart type="family">Post</namePart> <role> <roleTerm authority="marcrelator" type="text">editor</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Marco</namePart> <namePart type="family">Turchi</namePart> <role> <roleTerm authority="marcrelator" type="text">editor</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Karin</namePart> <namePart type="family">Verspoor</namePart> <role> <roleTerm authority="marcrelator" type="text">editor</roleTerm> </role> </name> <originInfo> <publisher>Association for Computational Linguistics</publisher> <place> <placeTerm type="text">Florence, Italy</placeTerm> </place> </originInfo> <genre authority="marcgt">conference publication</genre> </relatedItem> <abstract>In this article, we describe the TALP-UPC research group participation in the WMT19 news translation shared task for Kazakh-English. Given the low amount of parallel training data, we resort to using Russian as pivot language, training subword-based statistical translation systems for Russian-Kazakh and Russian-English that were then used to create two synthetic pseudo-parallel corpora for Kazakh-English and English-Kazakh respectively. Finally, a self-attention model based on the decoder part of the Transformer architecture was trained on the two pseudo-parallel corpora.</abstract> <identifier type="citekey">casas-etal-2019-talp</identifier> <identifier type="doi">10.18653/v1/W19-5311</identifier> <location> <url>https://aclanthology.org/W19-5311/</url> </location> <part> <date>2019-08</date> <extent unit="page"> <start>155</start> <end>162</end> </extent> </part> </mods> </modsCollection>
%0 Conference Proceedings %T The TALP-UPC Machine Translation Systems for WMT19 News Translation Task: Pivoting Techniques for Low Resource MT %A Casas, Noe %A Fonollosa, José A. R. %A Escolano, Carlos %A Basta, Christine %A Costa-jussà, Marta R. %Y Bojar, Ondřej %Y Chatterjee, Rajen %Y Federmann, Christian %Y Fishel, Mark %Y Graham, Yvette %Y Haddow, Barry %Y Huck, Matthias %Y Yepes, Antonio Jimeno %Y Koehn, Philipp %Y Martins, André %Y Monz, Christof %Y Negri, Matteo %Y Névéol, Aurélie %Y Neves, Mariana %Y Post, Matt %Y Turchi, Marco %Y Verspoor, Karin %S Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1) %D 2019 %8 August %I Association for Computational Linguistics %C Florence, Italy %F casas-etal-2019-talp %X In this article, we describe the TALP-UPC research group participation in the WMT19 news translation shared task for Kazakh-English. Given the low amount of parallel training data, we resort to using Russian as pivot language, training subword-based statistical translation systems for Russian-Kazakh and Russian-English that were then used to create two synthetic pseudo-parallel corpora for Kazakh-English and English-Kazakh respectively. Finally, a self-attention model based on the decoder part of the Transformer architecture was trained on the two pseudo-parallel corpora. %R 10.18653/v1/W19-5311 %U https://aclanthology.org/W19-5311/ %U https://doi.org/10.18653/v1/W19-5311 %P 155-162
Markdown (Informal)
[The TALP-UPC Machine Translation Systems for WMT19 News Translation Task: Pivoting Techniques for Low Resource MT](https://aclanthology.org/W19-5311/) (Casas et al., WMT 2019)
- The TALP-UPC Machine Translation Systems for WMT19 News Translation Task: Pivoting Techniques for Low Resource MT (Casas et al., WMT 2019)
ACL
- Noe Casas, José A. R. Fonollosa, Carlos Escolano, Christine Basta, and Marta R. Costa-jussà. 2019. The TALP-UPC Machine Translation Systems for WMT19 News Translation Task: Pivoting Techniques for Low Resource MT. In Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1), pages 155–162, Florence, Italy. Association for Computational Linguistics.