Language-Independent Approach for Morphological Disambiguation

Alymzhan Toleu, Gulmira Tolegen, Rustam Mussabayev


Abstract
This paper presents a language-independent approach to morphological disambiguation, a task regarded as an extension of POS tagging that jointly predicts complex morphological tags. In the proposed approach, all words, roots, POS tags, and morpheme tags are embedded into vectors, and context representations are computed from the surface-word and morphological contexts. The inner products between the candidate analyses and the context representations are then computed to perform the disambiguation. The underlying hypothesis is that the correct morphological analysis should be closer to its context in the vector space. Experimental results show that the proposed approach outperforms existing models on datasets for seven different languages. Concretely, compared with the MarMot baseline and a sophisticated neural model (Seq2Seq), the proposed approach achieves around a 6% improvement in average accuracy across all languages while running about 6 and 33 times faster than MarMot and Seq2Seq, respectively.
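The scoring idea in the abstract (embed each candidate analysis, embed the context, pick the analysis with the largest inner product) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the hand-set 2-dimensional embeddings, the choice to sum an analysis's symbol embeddings, and the choice to average the neighbouring word embeddings are all assumptions made for illustration. The sketch also uses only the surface-word context, whereas the abstract describes a morphological context as well.

```python
from typing import Dict, List, Sequence

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def vector_sum(vectors: Sequence[Vector]) -> Vector:
    """Element-wise sum of a non-empty list of equal-length vectors."""
    return [sum(column) for column in zip(*vectors)]


def embed_analysis(analysis: Sequence[str], table: Dict[str, Vector]) -> Vector:
    """Assumption: an analysis is represented by summing the embeddings
    of its root, POS tag, and morpheme tags."""
    return vector_sum([table[symbol] for symbol in analysis])


def embed_context(words: Sequence[str], table: Dict[str, Vector]) -> Vector:
    """Assumption: the context is the mean of the neighbouring word embeddings."""
    total = vector_sum([table[word] for word in words])
    return [x / len(words) for x in total]


def disambiguate(candidates: Sequence[Sequence[str]],
                 context: Vector,
                 table: Dict[str, Vector]) -> int:
    """Return the index of the candidate whose embedding has the largest
    inner product with the context vector."""
    scores = [dot(embed_analysis(c, table), context) for c in candidates]
    return max(range(len(scores)), key=scores.__getitem__)


# Toy, hand-set embeddings (hypothetical values, not learned parameters).
symbol_table: Dict[str, Vector] = {
    "book": [0.1, 0.1],
    "NOUN": [1.0, 0.0], "Plur": [0.5, 0.0],
    "VERB": [0.0, 1.0], "3Sg": [0.0, 0.5],
}
word_table: Dict[str, Vector] = {
    "she": [0.0, 1.0],
    "flights": [0.2, 0.6],
}

# Two candidate analyses for "books" in "she books flights".
candidates = [("book", "NOUN", "Plur"), ("book", "VERB", "3Sg")]
context = embed_context(["she", "flights"], word_table)
best = disambiguate(candidates, context, symbol_table)
print(candidates[best])
```

In a trained model the embeddings would be learned so that correct analyses score higher than incorrect ones in their contexts; here the values are chosen by hand only to show the scoring step.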
Anthology ID:
2022.coling-1.470
Volume:
Proceedings of the 29th International Conference on Computational Linguistics
Month:
October
Year:
2022
Address:
Gyeongju, Republic of Korea
Venue:
COLING
Publisher:
International Committee on Computational Linguistics
Pages:
5288–5297
URL:
https://aclanthology.org/2022.coling-1.470
Cite (ACL):
Alymzhan Toleu, Gulmira Tolegen, and Rustam Mussabayev. 2022. Language-Independent Approach for Morphological Disambiguation. In Proceedings of the 29th International Conference on Computational Linguistics, pages 5288–5297, Gyeongju, Republic of Korea. International Committee on Computational Linguistics.
Cite (Informal):
Language-Independent Approach for Morphological Disambiguation (Toleu et al., COLING 2022)
PDF:
https://aclanthology.org/2022.coling-1.470.pdf