Developing finite-state language technology for Maya

Robert Pugh, Francis Tyers, Quetzil Castañeda


Abstract
We describe a suite of finite-state language technologies for Maya, a Mayan language spoken in Mexico. At the core is a computational model of Maya morphology and phonology using a finite-state transducer. This model yields a morphological analyzer and a morphologically informed spell-checker. All of these technologies are designed for use both as a pedagogical reading/writing aid for L2 learners and as a general language processing tool capable of supporting much of the natural variation in written Maya. We discuss the relevant features of Maya morphosyntax and orthography, and then outline the implementation details of the analyzer. To conclude, we present a longer-term vision for these tools and their use by both native speakers and learners.
Anthology ID:
2023.americasnlp-1.5
Volume:
Proceedings of the Workshop on Natural Language Processing for Indigenous Languages of the Americas (AmericasNLP)
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Manuel Mager, Abteen Ebrahimi, Arturo Oncevay, Enora Rice, Shruti Rijhwani, Alexis Palmer, Katharina Kann
Venue:
AmericasNLP
Publisher:
Association for Computational Linguistics
Pages:
30–39
URL:
https://aclanthology.org/2023.americasnlp-1.5
DOI:
10.18653/v1/2023.americasnlp-1.5
Cite (ACL):
Robert Pugh, Francis Tyers, and Quetzil Castañeda. 2023. Developing finite-state language technology for Maya. In Proceedings of the Workshop on Natural Language Processing for Indigenous Languages of the Americas (AmericasNLP), pages 30–39, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
Developing finite-state language technology for Maya (Pugh et al., AmericasNLP 2023)
PDF:
https://aclanthology.org/2023.americasnlp-1.5.pdf