Language technology for the minority Finnic languages

Flammie A Pirinen, Trond Trosterud, Jack Rueter


Abstract
This article gives an overview of the state of the art in language technology tools for Balto-Finnic minority languages, i.e., Balto-Finnic languages other than Estonian and Finnish. For simplicity, we will use the term Finnic in this article when referring to all members of this language branch except the Estonian and Finnish literary languages. All in all, there are nine standardised languages represented in existing language technology infrastructures with keyboards, grammatical language models, proofing tools, annotated corpora and (for one of the languages) extensive ICALL programs. This article presents these tools and resources and discusses the relation between language models and proofing tool quality, as well as the (potential) impact of these tools on the respective language communities. The article rounds off with a discussion of prospects for future development.
Anthology ID:
2025.iwclul-1.6
Volume:
Proceedings of the 10th International Workshop on Computational Linguistics for Uralic Languages
Month:
December
Year:
2025
Address:
Joensuu, Finland
Editors:
Mika Hämäläinen, Michael Rießler, Eiaki V. Morooka, Lev Kharlashkin
Venues:
IWCLUL | WS
SIG:
SIGUR
Publisher:
Association for Computational Linguistics
Pages:
39–48
URL:
https://aclanthology.org/2025.iwclul-1.6/
Cite (ACL):
Flammie A Pirinen, Trond Trosterud, and Jack Rueter. 2025. Language technology for the minority Finnic languages. In Proceedings of the 10th International Workshop on Computational Linguistics for Uralic Languages, pages 39–48, Joensuu, Finland. Association for Computational Linguistics.
Cite (Informal):
Language technology for the minority Finnic languages (Pirinen et al., IWCLUL 2025)
PDF:
https://aclanthology.org/2025.iwclul-1.6.pdf