An FST morphological analyzer for the Gitksan language

Clarissa Forbes, Garrett Nicolai, Miikka Silfverberg


Abstract
This paper presents a finite-state morphological analyzer for the Gitksan language. The analyzer draws from a 1250-token Eastern dialect wordlist. It is based on finite-state technology and additionally includes two extensions that can provide analyses for out-of-vocabulary words: rules for generating predictable dialect variants, and a neural guesser component. The pre-neural analyzer, tested against interlinear-annotated texts from multiple dialects, achieves coverage of 75-81% and maintains high precision (95-100%). The neural extension improves coverage at the cost of lowered precision.
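As a rough illustration of the two evaluation figures reported in the abstract, the sketch below computes coverage and precision over a small set of tokens. The definitions are assumptions rather than the paper's stated method: coverage is taken to be the share of tokens that receive at least one analysis, and precision the share of analyzed tokens whose candidate set contains the gold analysis. The function names, the tuple layout, and the placeholder tokens and tags are all hypothetical and are not real Gitksan data.

```python
from typing import List, Set, Tuple

# Each record is (token, candidate analyses from the analyzer, gold analysis).
# This layout is a hypothetical convenience, not the paper's data format.
Record = Tuple[str, Set[str], str]


def coverage(records: List[Record]) -> float:
    """Share of tokens for which the analyzer returns at least one analysis."""
    if not records:
        return 0.0
    covered = sum(1 for _, candidates, _ in records if candidates)
    return covered / len(records)


def precision(records: List[Record]) -> float:
    """Share of analyzed tokens whose candidates include the gold analysis.

    Assumed definition: tokens with no analysis are excluded from the
    denominator, so precision is measured only over covered tokens.
    """
    analyzed = [(cands, gold) for _, cands, gold in records if cands]
    if not analyzed:
        return 0.0
    correct = sum(1 for cands, gold in analyzed if gold in cands)
    return correct / len(analyzed)


# Placeholder tokens and tags, invented purely for illustration.
records: List[Record] = [
    ("word_a", {"root_a+PL"}, "root_a+PL"),
    ("word_b", {"root_b+3SG", "root_b+ATTR"}, "root_b+ATTR"),
    ("word_c", {"root_c"}, "root_c+PL"),   # analyzed, but gold is missing
    ("word_d", set(), "root_d"),           # out of vocabulary: no analysis
]

print(f"coverage:  {coverage(records):.2f}")
print(f"precision: {precision(records):.2f}")
```

Under these assumed definitions, a guesser component that proposes analyses for previously unanalyzed tokens can only raise coverage, while any incorrect guesses enter the precision denominator, which is consistent with the trade-off the abstract describes.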
Anthology ID:
2021.sigmorphon-1.21
Volume:
Proceedings of the 18th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology
Month:
August
Year:
2021
Address:
Online
Editors:
Garrett Nicolai, Kyle Gorman, Ryan Cotterell
Venue:
SIGMORPHON
SIG:
SIGMORPHON
Publisher:
Association for Computational Linguistics
Pages:
188–197
URL:
https://aclanthology.org/2021.sigmorphon-1.21
DOI:
10.18653/v1/2021.sigmorphon-1.21
Cite (ACL):
Clarissa Forbes, Garrett Nicolai, and Miikka Silfverberg. 2021. An FST morphological analyzer for the Gitksan language. In Proceedings of the 18th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology, pages 188–197, Online. Association for Computational Linguistics.
Cite (Informal):
An FST morphological analyzer for the Gitksan language (Forbes et al., SIGMORPHON 2021)
PDF:
https://aclanthology.org/2021.sigmorphon-1.21.pdf
Video:
https://aclanthology.org/2021.sigmorphon-1.21.mp4