Neural Machine Translation through Active Learning on low-resource languages: The case of Spanish to Mapudungun

Begoña Pendas, Andres Carvallo, Carlos Aspillaga


Abstract
Active learning is an algorithmic approach that strategically selects a subset of examples for labeling, with the goal of reducing workload and required resources. Previous research has applied active learning to Neural Machine Translation (NMT) for high-resource or well-represented languages, achieving significant reductions in manual labor. In this study, we explore the application of active learning for NMT in the context of Mapudungun, a low-resource language spoken by the Mapuche community in South America. Mapudungun was chosen due to the limited number of fluent speakers and the pressing need to provide access to content predominantly available in widely represented languages. We assess both model-dependent and model-agnostic active learning strategies for NMT between Spanish and Mapudungun in both directions, demonstrating that we can achieve a reduction of over 40% in manual translation workload in both cases.
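To make the idea of selecting a subset of examples for labeling concrete, here is a minimal sketch of one model-agnostic selection step. It is only an illustration under stated assumptions: the abstract does not say which strategies were evaluated, so the greedy word-coverage heuristic, the function name `select_batch`, and the toy Spanish sentences are all hypothetical and should not be read as the authors' method.

```python
from typing import List, Set


def select_batch(pool: List[str], labeled: List[str], batch_size: int) -> List[str]:
    """Greedily pick pool sentences that add the most not-yet-covered words.

    Hypothetical model-agnostic heuristic: it looks only at the source text,
    never at a translation model's predictions.
    """
    # Words already present in the sentences that have been translated.
    covered: Set[str] = {w for s in labeled for w in s.lower().split()}
    remaining = list(pool)
    chosen: List[str] = []
    while remaining and len(chosen) < batch_size:
        # Score = number of distinct words the sentence would newly cover.
        best = max(remaining, key=lambda s: len(set(s.lower().split()) - covered))
        chosen.append(best)
        covered |= set(best.lower().split())
        remaining.remove(best)
    return chosen


# Toy unlabeled pool of source sentences (invented for illustration).
pool = [
    "el perro come",
    "el gato duerme en la casa",
    "el perro duerme",
]
batch = select_batch(pool, labeled=["el perro corre"], batch_size=2)
```

In a full active learning loop, the sentences in `batch` would be sent to a human translator, added to the training set, and the selection repeated. A model-dependent strategy would instead score each candidate using the current NMT model, for example by its uncertainty on that sentence.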
Anthology ID:
2023.americasnlp-1.2
Volume:
Proceedings of the Workshop on Natural Language Processing for Indigenous Languages of the Americas (AmericasNLP)
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Manuel Mager, Abteen Ebrahimi, Arturo Oncevay, Enora Rice, Shruti Rijhwani, Alexis Palmer, Katharina Kann
Venue:
AmericasNLP
Publisher:
Association for Computational Linguistics
Pages:
6–11
URL:
https://aclanthology.org/2023.americasnlp-1.2
DOI:
10.18653/v1/2023.americasnlp-1.2
Cite (ACL):
Begoña Pendas, Andres Carvallo, and Carlos Aspillaga. 2023. Neural Machine Translation through Active Learning on low-resource languages: The case of Spanish to Mapudungun. In Proceedings of the Workshop on Natural Language Processing for Indigenous Languages of the Americas (AmericasNLP), pages 6–11, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
Neural Machine Translation through Active Learning on low-resource languages: The case of Spanish to Mapudungun (Pendas et al., AmericasNLP 2023)
PDF:
https://aclanthology.org/2023.americasnlp-1.2.pdf