Spell-Checking based on Syllabification and Character-level Graphs for a Peruvian Agglutinative Language

Carlo Alva, Arturo Oncevay


Abstract
There are several native languages in Peru, most of which are agglutinative. These languages are transmitted from generation to generation mainly in oral form, which has led to different written forms across communities. For this reason, there are recent efforts to standardize spelling in written texts, and it would be beneficial to support these efforts with an automatic tool such as a spell-checker. To that end, this spelling corrector is being developed in two steps: an automatic rule-based syllabification method and a character-level graph to detect the degree of error in a misspelled word. The experiments were conducted on Shipibo-konibo, a highly agglutinative Amazonian language, and the results obtained on a dataset built for the purpose are promising.
Anthology ID:
W17-4116
Volume:
Proceedings of the First Workshop on Subword and Character Level Models in NLP
Month:
September
Year:
2017
Address:
Copenhagen, Denmark
Venue:
SCLeM
Publisher:
Association for Computational Linguistics
Pages:
109–116
URL:
https://aclanthology.org/W17-4116
DOI:
10.18653/v1/W17-4116
Cite (ACL):
Carlo Alva and Arturo Oncevay. 2017. Spell-Checking based on Syllabification and Character-level Graphs for a Peruvian Agglutinative Language. In Proceedings of the First Workshop on Subword and Character Level Models in NLP, pages 109–116, Copenhagen, Denmark. Association for Computational Linguistics.
Cite (Informal):
Spell-Checking based on Syllabification and Character-level Graphs for a Peruvian Agglutinative Language (Alva & Oncevay, SCLeM 2017)
PDF:
https://aclanthology.org/W17-4116.pdf