Ayuuk-Spanish Neural Machine Translator

Delfino Zacarías Márquez, Ivan Vladimir Meza Ruiz


Abstract
This paper presents the first neural machine translation system for the Ayuuk language. In our experiments we translate from Ayuuk to Spanish and from Spanish to Ayuuk. Ayuuk is a language spoken in the state of Oaxaca, Mexico, by the Ayuukjä’äy people (commonly known in Spanish as Mixes). We use different sources to create a low-resource parallel corpus of more than 6,000 phrases. For some of these resources we rely on automatic alignment. The proposed system is based on the Transformer neural architecture and uses sub-word level tokenization as input. We report the current performance given the resources we have collected for the San Juan Güichicovi variant; the results are promising, reaching up to 5 BLEU. We based our development on the Masakhane project for African languages.
Anthology ID:
2021.americasnlp-1.19
Volume:
Proceedings of the First Workshop on Natural Language Processing for Indigenous Languages of the Americas
Month:
June
Year:
2021
Address:
Online
Editors:
Manuel Mager, Arturo Oncevay, Annette Rios, Ivan Vladimir Meza Ruiz, Alexis Palmer, Graham Neubig, Katharina Kann
Venue:
AmericasNLP
Publisher:
Association for Computational Linguistics
Pages:
168–172
URL:
https://aclanthology.org/2021.americasnlp-1.19
DOI:
10.18653/v1/2021.americasnlp-1.19
Cite (ACL):
Delfino Zacarías Márquez and Ivan Vladimir Meza Ruiz. 2021. Ayuuk-Spanish Neural Machine Translator. In Proceedings of the First Workshop on Natural Language Processing for Indigenous Languages of the Americas, pages 168–172, Online. Association for Computational Linguistics.
Cite (Informal):
Ayuuk-Spanish Neural Machine Translator (Zacarías Márquez & Meza Ruiz, AmericasNLP 2021)
PDF:
https://aclanthology.org/2021.americasnlp-1.19.pdf