Natural Language Processing in Ethiopian Languages: Current State, Challenges, and Opportunities

Atnafu Lambebo Tonja, Tadesse Destaw Belay, Israel Abebe Azime, Abinew Ali Ayele, Moges Ahmed Mehamed, Olga Kolesnikova, Seid Muhie Yimam


Abstract
This survey delves into the current state of natural language processing (NLP) for four Ethiopian languages: Amharic, Afaan Oromo, Tigrinya, and Wolaytta. Through this paper, we identify key challenges and opportunities for NLP research in Ethiopia. Furthermore, we provide a centralized repository on GitHub that contains publicly available resources for various NLP tasks in these languages. This repository can be updated periodically with contributions from other researchers. Our objective is to disseminate information to NLP researchers interested in Ethiopian languages and encourage future research in this domain.
Anthology ID:
2023.rail-1.14
Volume:
Proceedings of the Fourth workshop on Resources for African Indigenous Languages (RAIL 2023)
Month:
May
Year:
2023
Address:
Dubrovnik, Croatia
Editors:
Rooweither Mabuya, Don Mthobela, Mmasibidi Setaka, Menno Van Zaanen
Venue:
RAIL
Publisher:
Association for Computational Linguistics
Pages:
126–139
URL:
https://aclanthology.org/2023.rail-1.14
DOI:
10.18653/v1/2023.rail-1.14
Cite (ACL):
Atnafu Lambebo Tonja, Tadesse Destaw Belay, Israel Abebe Azime, Abinew Ali Ayele, Moges Ahmed Mehamed, Olga Kolesnikova, and Seid Muhie Yimam. 2023. Natural Language Processing in Ethiopian Languages: Current State, Challenges, and Opportunities. In Proceedings of the Fourth workshop on Resources for African Indigenous Languages (RAIL 2023), pages 126–139, Dubrovnik, Croatia. Association for Computational Linguistics.
Cite (Informal):
Natural Language Processing in Ethiopian Languages: Current State, Challenges, and Opportunities (Tonja et al., RAIL 2023)
PDF:
https://aclanthology.org/2023.rail-1.14.pdf
Video:
https://aclanthology.org/2023.rail-1.14.mp4