Machine Translation for Livonian: Catering to 20 Speakers

Matīss Rikters, Marili Tomingas, Tuuli Tuisk, Valts Ernštreits, Mark Fishel


Abstract
Livonian is one of the most endangered languages in Europe, with only a handful of speakers and virtually no publicly available corpora. In this paper we tackle the task of developing neural machine translation (NMT) between Livonian and English, with a two-fold aim: on the one hand, preserving the language and, on the other, enabling access to Livonian folklore, life stories and other textual intangible heritage, as well as making it easier to create further parallel corpora. We rely on Livonian’s linguistic similarity to Estonian and Latvian and collect parallel and monolingual data for the four languages for translation experiments. We combine different low-resource NMT techniques, such as zero-shot translation, cross-lingual transfer and synthetic data creation, to reach the highest possible translation quality and to find which base languages are empirically more helpful for transfer to Livonian. The resulting NMT systems and the collected monolingual and parallel data, including a manually translated and verified translation benchmark, are publicly released via OPUS and Hugging Face repositories.
Anthology ID:
2022.acl-short.55
Volume:
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
Month:
May
Year:
2022
Address:
Dublin, Ireland
Editors:
Smaranda Muresan, Preslav Nakov, Aline Villavicencio
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
508–514
URL:
https://aclanthology.org/2022.acl-short.55
DOI:
10.18653/v1/2022.acl-short.55
Cite (ACL):
Matīss Rikters, Marili Tomingas, Tuuli Tuisk, Valts Ernštreits, and Mark Fishel. 2022. Machine Translation for Livonian: Catering to 20 Speakers. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 508–514, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
Machine Translation for Livonian: Catering to 20 Speakers (Rikters et al., ACL 2022)
PDF:
https://aclanthology.org/2022.acl-short.55.pdf
Video:
https://aclanthology.org/2022.acl-short.55.mp4