Héctor Murrieta Bello
2021
Moses and the Character-Based Random Babbling Baseline: CoAStaL at AmericasNLP 2021 Shared Task
Marcel Bollmann | Rahul Aralikatte | Héctor Murrieta Bello | Daniel Hershcovich | Miryam de Lhoneux | Anders Søgaard
Proceedings of the First Workshop on Natural Language Processing for Indigenous Languages of the Americas
We evaluated a range of neural machine translation techniques developed specifically for low-resource scenarios. Unsuccessfully. In the end, we submitted two runs: (i) a standard phrase-based model, and (ii) a random babbling baseline using character trigrams. We found that it was surprisingly hard to beat (i), in spite of this model being, in theory, a bad fit for polysynthetic languages; and more interestingly, that (ii) was better than several of the submitted systems, highlighting how difficult low-resource machine translation for polysynthetic languages is.
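To make run (ii) concrete, here is a minimal sketch of what a character-trigram "random babbling" generator could look like. It is an illustration under stated assumptions, not the submitted system: the abstract does not describe the implementation, so the function names `train_trigrams` and `babble`, the padding symbols, the length cap, and the choice to ignore the source sentence entirely are all hypothetical.

```python
import random
from collections import Counter, defaultdict

BOS = "\x02"  # hypothetical start-of-sentence padding symbol
EOS = "\x03"  # hypothetical end-of-sentence symbol


def train_trigrams(sentences):
    """Count which character follows each two-character context."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        padded = BOS * 2 + sentence + EOS
        for i in range(len(padded) - 2):
            counts[padded[i:i + 2]][padded[i + 2]] += 1
    return counts


def babble(counts, rng, max_len=200):
    """Sample one character at a time until EOS or max_len is reached."""
    context, out = BOS * 2, []
    while len(out) < max_len:
        followers = counts.get(context)
        if not followers:  # only happens when the training data was empty
            break
        chars = list(followers)
        weights = [followers[c] for c in chars]
        nxt = rng.choices(chars, weights=weights)[0]
        if nxt == EOS:
            break
        out.append(nxt)
        context = context[1] + nxt  # slide the two-character window
    return "".join(out)


if __name__ == "__main__":
    # Toy target-side "corpus"; real training data would be the shared-task text.
    corpus = ["hola mundo", "hola amigo", "un mundo amigo"]
    model = train_trigrams(corpus)
    print(babble(model, random.Random(0)))
```

Such a generator only reproduces the target language's local character statistics, so any score it earns reflects surface overlap rather than translation. That is why a system failing to beat it is informative.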