eTranslation’s Submissions to the WMT22 General Machine Translation Task
Csaba Oravecz, Katina Bontcheva, David Kolovratnìk, Bogomil Kovachev, Christopher Scott
Abstract
This paper describes the NMT models for French–German, English–Ukrainian and English–Russian submitted by the eTranslation team to the WMT22 general machine translation shared task. In last year's WMT news task, multilingual systems with deep and complex architectures, utilizing immense amounts of data and resources, were dominant. This year, with the task extended to cover less domain-specific text, we expected such systems to be even more dominant. Hoping to produce competitive (constrained) systems despite our limited resources, this time we selected only medium-resource language pairs that are serviced in the European Commission’s eTranslation system. We explored less resource-intensive strategies, focusing on data selection and filtering, to improve the performance of baseline systems. Our submitted systems scored competitively according to the automatic rankings, except for the English–Russian model, where our submission was only a baseline reference model developed as a by-product of the multilingual setup we built primarily for the English–Ukrainian language pair.
- Anthology ID:
- 2022.wmt-1.29
- Volume:
- Proceedings of the Seventh Conference on Machine Translation (WMT)
- Month:
- December
- Year:
- 2022
- Address:
- Abu Dhabi, United Arab Emirates (Hybrid)
- Editors:
- Philipp Koehn, Loïc Barrault, Ondřej Bojar, Fethi Bougares, Rajen Chatterjee, Marta R. Costa-jussà, Christian Federmann, Mark Fishel, Alexander Fraser, Markus Freitag, Yvette Graham, Roman Grundkiewicz, Paco Guzman, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Tom Kocmi, André Martins, Makoto Morishita, Christof Monz, Masaaki Nagata, Toshiaki Nakazawa, Matteo Negri, Aurélie Névéol, Mariana Neves, Martin Popel, Marco Turchi, Marcos Zampieri
- Venue:
- WMT
- SIG:
- SIGMT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 346–351
- URL:
- https://aclanthology.org/2022.wmt-1.29/
- Cite (ACL):
- Csaba Oravecz, Katina Bontcheva, David Kolovratnìk, Bogomil Kovachev, and Christopher Scott. 2022. eTranslation’s Submissions to the WMT22 General Machine Translation Task. In Proceedings of the Seventh Conference on Machine Translation (WMT), pages 346–351, Abu Dhabi, United Arab Emirates (Hybrid). Association for Computational Linguistics.
- Cite (Informal):
- eTranslation’s Submissions to the WMT22 General Machine Translation Task (Oravecz et al., WMT 2022)
- PDF:
- https://aclanthology.org/2022.wmt-1.29.pdf