Noun Class Disambiguation in Runyankore and Related Languages

Joan Byamugisha


Abstract
Bantu languages are spoken by communities in more than half of the countries on the African continent by an estimated third of a billion people. Despite this populous and the amount of high quality linguistic research done over the years, Bantu languages are still computationally under-resourced. The biggest limitation to the development of computational methods for processing Bantu language text is their complex grammatical structure, chiefly in the system of noun classes. We investigated the use of a combined syntactic and semantic method to disambiguate among singular nouns with the same class prefix but belonging to different noun classes. This combination uses the semantic generalizations of the types of nouns in each class to overcome the limitations of relying only on the prefixes they take. We used the nearest neighbors of a query word as semantic generalizations, and developed a tool to determine the noun class based on resources in Runyankore, a Bantu language indigenous to Uganda. We also investigated whether, with the same Runyankore resources, our method had utility in other Bantu languages, Luganda, indigenous to Uganda, and Kinyarwanda, indigenous to Rwanda. For all three languages, the combined approach resulted in an improvement in accuracy, as compared to using only the syntactic or the semantic approach.
Anthology ID:
2022.coling-1.383
Volume:
Proceedings of the 29th International Conference on Computational Linguistics
Month:
October
Year:
2022
Address:
Gyeongju, Republic of Korea
Editors:
Nicoletta Calzolari, Chu-Ren Huang, Hansaem Kim, James Pustejovsky, Leo Wanner, Key-Sun Choi, Pum-Mo Ryu, Hsin-Hsi Chen, Lucia Donatelli, Heng Ji, Sadao Kurohashi, Patrizia Paggio, Nianwen Xue, Seokhwan Kim, Younggyun Hahm, Zhong He, Tony Kyungil Lee, Enrico Santus, Francis Bond, Seung-Hoon Na
Venue:
COLING
SIG:
Publisher:
International Committee on Computational Linguistics
Note:
Pages:
4350–4359
Language:
URL:
https://aclanthology.org/2022.coling-1.383
DOI:
Bibkey:
Cite (ACL):
Joan Byamugisha. 2022. Noun Class Disambiguation in Runyankore and Related Languages. In Proceedings of the 29th International Conference on Computational Linguistics, pages 4350–4359, Gyeongju, Republic of Korea. International Committee on Computational Linguistics.
Cite (Informal):
Noun Class Disambiguation in Runyankore and Related Languages (Byamugisha, COLING 2022)
Copy Citation:
PDF:
https://aclanthology.org/2022.coling-1.383.pdf