UdS-DFKI@WMT20: Unsupervised MT and Very Low Resource Supervised MT for German-Upper Sorbian
Sourav Dutta, Jesujoba Alabi, Saptarashmi Bandyopadhyay, Dana Ruiter, Josef van Genabith
Abstract
This paper describes the UdS-DFKI submission to the shared task for unsupervised machine translation (MT) and very low-resource supervised MT between German (de) and Upper Sorbian (hsb) at the Fifth Conference of Machine Translation (WMT20). We submit systems for both the supervised and unsupervised tracks. Apart from various experimental approaches like bitext mining, model pre-training, and iterative back-translation, we employ a factored machine translation approach on a small BPE vocabulary.- Anthology ID:
- 2020.wmt-1.129
- Volume:
- Proceedings of the Fifth Conference on Machine Translation
- Month:
- November
- Year:
- 2020
- Address:
- Online
- Editors:
- Loïc Barrault, Ondřej Bojar, Fethi Bougares, Rajen Chatterjee, Marta R. Costa-jussà, Christian Federmann, Mark Fishel, Alexander Fraser, Yvette Graham, Paco Guzman, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Philipp Koehn, André Martins, Makoto Morishita, Christof Monz, Masaaki Nagata, Toshiaki Nakazawa, Matteo Negri
- Venue:
- WMT
- SIG:
- SIGMT
- Publisher:
- Association for Computational Linguistics
- Note:
- Pages:
- 1092–1098
- Language:
- URL:
- https://aclanthology.org/2020.wmt-1.129
- DOI:
- Bibkey:
- Cite (ACL):
- Sourav Dutta, Jesujoba Alabi, Saptarashmi Bandyopadhyay, Dana Ruiter, and Josef van Genabith. 2020. UdS-DFKI@WMT20: Unsupervised MT and Very Low Resource Supervised MT for German-Upper Sorbian. In Proceedings of the Fifth Conference on Machine Translation, pages 1092–1098, Online. Association for Computational Linguistics.
- Cite (Informal):
- UdS-DFKI@WMT20: Unsupervised MT and Very Low Resource Supervised MT for German-Upper Sorbian (Dutta et al., WMT 2020)
- Copy Citation:
- PDF:
- https://aclanthology.org/2020.wmt-1.129.pdf
- Video:
- https://slideslive.com/38939584
Export citation
@inproceedings{dutta-etal-2020-uds, title = "{U}d{S}-{DFKI}@{WMT}20: Unsupervised {MT} and Very Low Resource Supervised {MT} for {G}erman-{U}pper {S}orbian", author = "Dutta, Sourav and Alabi, Jesujoba and Bandyopadhyay, Saptarashmi and Ruiter, Dana and van Genabith, Josef", editor = {Barrault, Lo{\"\i}c and Bojar, Ond{\v{r}}ej and Bougares, Fethi and Chatterjee, Rajen and Costa-juss{\`a}, Marta R. and Federmann, Christian and Fishel, Mark and Fraser, Alexander and Graham, Yvette and Guzman, Paco and Haddow, Barry and Huck, Matthias and Yepes, Antonio Jimeno and Koehn, Philipp and Martins, Andr{\'e} and Morishita, Makoto and Monz, Christof and Nagata, Masaaki and Nakazawa, Toshiaki and Negri, Matteo}, booktitle = "Proceedings of the Fifth Conference on Machine Translation", month = nov, year = "2020", address = "Online", publisher = "Association for Computational Linguistics", url = "https://aclanthology.org/2020.wmt-1.129", pages = "1092--1098", abstract = "This paper describes the UdS-DFKI submission to the shared task for unsupervised machine translation (MT) and very low-resource supervised MT between German (de) and Upper Sorbian (hsb) at the Fifth Conference of Machine Translation (WMT20). We submit systems for both the supervised and unsupervised tracks. Apart from various experimental approaches like bitext mining, model pre-training, and iterative back-translation, we employ a factored machine translation approach on a small BPE vocabulary.", }
<?xml version="1.0" encoding="UTF-8"?> <modsCollection xmlns="http://www.loc.gov/mods/v3"> <mods ID="dutta-etal-2020-uds"> <titleInfo> <title>UdS-DFKI@WMT20: Unsupervised MT and Very Low Resource Supervised MT for German-Upper Sorbian</title> </titleInfo> <name type="personal"> <namePart type="given">Sourav</namePart> <namePart type="family">Dutta</namePart> <role> <roleTerm authority="marcrelator" type="text">author</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Jesujoba</namePart> <namePart type="family">Alabi</namePart> <role> <roleTerm authority="marcrelator" type="text">author</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Saptarashmi</namePart> <namePart type="family">Bandyopadhyay</namePart> <role> <roleTerm authority="marcrelator" type="text">author</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Dana</namePart> <namePart type="family">Ruiter</namePart> <role> <roleTerm authority="marcrelator" type="text">author</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Josef</namePart> <namePart type="family">van Genabith</namePart> <role> <roleTerm authority="marcrelator" type="text">author</roleTerm> </role> </name> <originInfo> <dateIssued>2020-11</dateIssued> </originInfo> <typeOfResource>text</typeOfResource> <relatedItem type="host"> <titleInfo> <title>Proceedings of the Fifth Conference on Machine Translation</title> </titleInfo> <name type="personal"> <namePart type="given">Loïc</namePart> <namePart type="family">Barrault</namePart> <role> <roleTerm authority="marcrelator" type="text">editor</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Ondřej</namePart> <namePart type="family">Bojar</namePart> <role> <roleTerm authority="marcrelator" type="text">editor</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Fethi</namePart> <namePart type="family">Bougares</namePart> <role> <roleTerm authority="marcrelator" type="text">editor</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Rajen</namePart> <namePart type="family">Chatterjee</namePart> <role> <roleTerm authority="marcrelator" type="text">editor</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Marta</namePart> <namePart type="given">R</namePart> <namePart type="family">Costa-jussà</namePart> <role> <roleTerm authority="marcrelator" type="text">editor</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Christian</namePart> <namePart type="family">Federmann</namePart> <role> <roleTerm authority="marcrelator" type="text">editor</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Mark</namePart> <namePart type="family">Fishel</namePart> <role> <roleTerm authority="marcrelator" type="text">editor</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Alexander</namePart> <namePart type="family">Fraser</namePart> <role> <roleTerm authority="marcrelator" type="text">editor</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Yvette</namePart> <namePart type="family">Graham</namePart> <role> <roleTerm authority="marcrelator" type="text">editor</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Paco</namePart> <namePart type="family">Guzman</namePart> <role> <roleTerm authority="marcrelator" type="text">editor</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Barry</namePart> <namePart type="family">Haddow</namePart> <role> <roleTerm authority="marcrelator" type="text">editor</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Matthias</namePart> <namePart type="family">Huck</namePart> <role> <roleTerm authority="marcrelator" type="text">editor</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Antonio</namePart> <namePart type="given">Jimeno</namePart> <namePart type="family">Yepes</namePart> <role> <roleTerm authority="marcrelator" type="text">editor</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Philipp</namePart> <namePart type="family">Koehn</namePart> <role> <roleTerm authority="marcrelator" type="text">editor</roleTerm> </role> </name> <name type="personal"> <namePart type="given">André</namePart> <namePart type="family">Martins</namePart> <role> <roleTerm authority="marcrelator" type="text">editor</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Makoto</namePart> <namePart type="family">Morishita</namePart> <role> <roleTerm authority="marcrelator" type="text">editor</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Christof</namePart> <namePart type="family">Monz</namePart> <role> <roleTerm authority="marcrelator" type="text">editor</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Masaaki</namePart> <namePart type="family">Nagata</namePart> <role> <roleTerm authority="marcrelator" type="text">editor</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Toshiaki</namePart> <namePart type="family">Nakazawa</namePart> <role> <roleTerm authority="marcrelator" type="text">editor</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Matteo</namePart> <namePart type="family">Negri</namePart> <role> <roleTerm authority="marcrelator" type="text">editor</roleTerm> </role> </name> <originInfo> <publisher>Association for Computational Linguistics</publisher> <place> <placeTerm type="text">Online</placeTerm> </place> </originInfo> <genre authority="marcgt">conference publication</genre> </relatedItem> <abstract>This paper describes the UdS-DFKI submission to the shared task for unsupervised machine translation (MT) and very low-resource supervised MT between German (de) and Upper Sorbian (hsb) at the Fifth Conference of Machine Translation (WMT20). We submit systems for both the supervised and unsupervised tracks. Apart from various experimental approaches like bitext mining, model pre-training, and iterative back-translation, we employ a factored machine translation approach on a small BPE vocabulary.</abstract> <identifier type="citekey">dutta-etal-2020-uds</identifier> <location> <url>https://aclanthology.org/2020.wmt-1.129</url> </location> <part> <date>2020-11</date> <extent unit="page"> <start>1092</start> <end>1098</end> </extent> </part> </mods> </modsCollection>
%0 Conference Proceedings %T UdS-DFKI@WMT20: Unsupervised MT and Very Low Resource Supervised MT for German-Upper Sorbian %A Dutta, Sourav %A Alabi, Jesujoba %A Bandyopadhyay, Saptarashmi %A Ruiter, Dana %A van Genabith, Josef %Y Barrault, Loïc %Y Bojar, Ondřej %Y Bougares, Fethi %Y Chatterjee, Rajen %Y Costa-jussà, Marta R. %Y Federmann, Christian %Y Fishel, Mark %Y Fraser, Alexander %Y Graham, Yvette %Y Guzman, Paco %Y Haddow, Barry %Y Huck, Matthias %Y Yepes, Antonio Jimeno %Y Koehn, Philipp %Y Martins, André %Y Morishita, Makoto %Y Monz, Christof %Y Nagata, Masaaki %Y Nakazawa, Toshiaki %Y Negri, Matteo %S Proceedings of the Fifth Conference on Machine Translation %D 2020 %8 November %I Association for Computational Linguistics %C Online %F dutta-etal-2020-uds %X This paper describes the UdS-DFKI submission to the shared task for unsupervised machine translation (MT) and very low-resource supervised MT between German (de) and Upper Sorbian (hsb) at the Fifth Conference of Machine Translation (WMT20). We submit systems for both the supervised and unsupervised tracks. Apart from various experimental approaches like bitext mining, model pre-training, and iterative back-translation, we employ a factored machine translation approach on a small BPE vocabulary. %U https://aclanthology.org/2020.wmt-1.129 %P 1092-1098
Markdown (Informal)
[UdS-DFKI@WMT20: Unsupervised MT and Very Low Resource Supervised MT for German-Upper Sorbian](https://aclanthology.org/2020.wmt-1.129) (Dutta et al., WMT 2020)
- UdS-DFKI@WMT20: Unsupervised MT and Very Low Resource Supervised MT for German-Upper Sorbian (Dutta et al., WMT 2020)
ACL
- Sourav Dutta, Jesujoba Alabi, Saptarashmi Bandyopadhyay, Dana Ruiter, and Josef van Genabith. 2020. UdS-DFKI@WMT20: Unsupervised MT and Very Low Resource Supervised MT for German-Upper Sorbian. In Proceedings of the Fifth Conference on Machine Translation, pages 1092–1098, Online. Association for Computational Linguistics.