Csilla Horváth
2025
A Mansi FST and spellchecker
Jack Rueter | Csilla Horváth | Trond Trosterud
Proceedings of the 9th Workshop on Constraint Grammar and Finite State NLP
The article presents a finite-state transducer and spellchecker for Mansi, an Ob-Ugric Uralic language spoken in northwestern Siberia. Mansi has a rich but mostly agglutinative morphology, with a morphophonology dominated by sandhi phenomena. With a small set of morphophonological rules (32 twolc rules) and a lexicon consisting of 12,000 Mansi entries and a larger set of proper nouns, we were able to build a transducer covering 98.9% of a large (700k) newspaper corpus. Being part of the GiellaLT infrastructure, the transducer was turned into a spellchecker. The most common spelling error in Mansi is the omission of length marks on vowels. For the 1000 most common words containing long vowels, the spellchecker gave a correct suggestion among the top five in 98.3% of the cases, and as the first suggestion in 91.3% of the cases.
2020
apPILcation: an Android-based Tool for Learning Mansi
Gábor Bobály | Csilla Horváth | Veronika Vincze
Proceedings of the Sixth International Workshop on Computational Linguistics of Uralic Languages
2017
Language technology resources and tools for Mansi: an overview
Csilla Horváth | Norbert Szilágyi | Veronika Vincze | Ágoston Nagy
Proceedings of the Third Workshop on Computational Linguistics for Uralic Languages
2016
Where Bears Have the Eyes of Currant: Towards a Mansi WordNet
Csilla Horváth | Ágoston Nagy | Norbert Szilágyi | Veronika Vincze
Proceedings of the 8th Global WordNet Conference (GWC)
Here we report the construction of a wordnet for Mansi, an endangered minority language spoken in Russia. We will pay special attention to challenges that we encountered during the building process, among which the most important ones are the low number of native speakers, the lack of thesauri and the bear language. We will discuss our solutions to these issues, which might have some theoretical implications for the methodology of wordnet building in general.
Co-authors
- Veronika Vincze 3
- Ágoston Nagy 2
- Norbert Szilágyi 2
- Gábor Bobály 1
- Jack Rueter 1