Automatically Generating IsiZulu Words From Indo-Arabic Numerals

Zola Mahlaza; Tadiwa Magwenzi; C. Maria Keet; Langa Khumalo

doi:10.18653/v1/2024.inlg-main.21

Abstract

Artificial conversational agents are deployed to assist humans in a variety of tasks. Some of these tasks, such as banking and scheduling appointments, require the agents to communicate numbers that are part of their internal, abstract representations of meaning. They currently cannot do so for isiZulu because no algorithms exist for the task: speech and text data are lacking, and the transformation is complex and may depend on the type of noun that is counted. We solved this by extracting and iteratively improving the rules for speaking and writing numerals as words, and by creating two algorithms to automate the transformation. Evaluation of the algorithms by two isiZulu grammarians showed that six out of seven number categories were 90-100% correct. The same software was used with an additional set of rules to create a large monolingual text corpus of 771 643 sentences to enable future data-driven approaches.

Anthology ID: 2024.inlg-main.21
Volume: Proceedings of the 17th International Natural Language Generation Conference
Month: September
Year: 2024
Address: Tokyo, Japan
Editors: Saad Mahamood, Nguyen Le Minh, Daphne Ippolito
Venue: INLG
SIG: SIGGEN
Publisher: Association for Computational Linguistics
Pages: 254–271
URL: https://aclanthology.org/2024.inlg-main.21/
DOI: 10.18653/v1/2024.inlg-main.21
Cite (ACL): Zola Mahlaza, Tadiwa Magwenzi, C. Maria Keet, and Langa Khumalo. 2024. Automatically Generating IsiZulu Words From Indo-Arabic Numerals. In Proceedings of the 17th International Natural Language Generation Conference, pages 254–271, Tokyo, Japan. Association for Computational Linguistics.
Cite (Informal): Automatically Generating IsiZulu Words From Indo-Arabic Numerals (Mahlaza et al., INLG 2024)
PDF: https://aclanthology.org/2024.inlg-main.21.pdf
