Advancing African NLP: UDMorph and flexiPipe

Maarten Janssen


Abstract
In this paper, we present some of our recent efforts to provide base NLP pipelines for African languages. These include UDMorph, an infrastructure that makes UD-compatible training data available for resources that lack dependency relations, and flexiPipe, a Python package that runs an NLP pipeline in various NLP tools through a uniform front-end, including the models provided by UDMorph. flexiPipe also provides Unicode normalization, an often overlooked feature that has a significant impact on African NLP. flexiPipe currently provides an NLP pipeline for 33 African languages, a significant increase over the handful of models that are currently easily accessible, and UDMorph is designed to make it easy to provide training data for more languages.
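As a rough illustration of why Unicode normalization matters here, the sketch below uses only Python's standard `unicodedata` module and does not show flexiPipe's own API. The helper name `normalize_text` and the choice of NFC as the target form are assumptions made for this example. It encodes the same letter, e with a dot below and a grave accent (as found in Yoruba orthography), in three different code point sequences that display identically but compare as different strings until they are normalized.

```python
import unicodedata


def normalize_text(text: str, form: str = "NFC") -> str:
    """Return text in the given Unicode normalization form.

    Hypothetical helper for illustration only; NFC is an assumed default,
    not a documented flexiPipe setting.
    """
    return unicodedata.normalize(form, text)


# Three encodings of the same visible character.
variant_a = "e\u0323\u0300"   # e + combining dot below + combining grave
variant_b = "e\u0300\u0323"   # e + combining grave + combining dot below
variant_c = "\u00e8\u0323"    # precomposed e-grave + combining dot below

variants = [variant_a, variant_b, variant_c]

# Raw strings differ, so naive lookups in a lexicon or vocabulary would miss.
raw_forms = set(variants)

# After normalization all three collapse to a single canonical string.
normalized_forms = {normalize_text(v) for v in variants}

if __name__ == "__main__":
    print(len(raw_forms), "distinct raw strings")
    print(len(normalized_forms), "distinct normalized strings")
```

Without a step like this, a tagger trained on one encoding may treat text in another encoding as containing unknown tokens, which is one plausible reading of the impact the abstract describes.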
Anthology ID:
2026.africanlp-main.13
Volume:
Proceedings of the 7th Workshop on African Natural Language Processing (AfricaNLP 2026)
Month:
March
Year:
2026
Address:
Rabat, Morocco
Editors:
Everlyn Asiko Chimoto, Constantine Lignos, Shamsuddeen Muhammad, Idris Abdulmumin, Clemencia Siro, David Ifeoluwa Adelani
Venues:
AfricaNLP | WS
Publisher:
Association for Computational Linguistics
Pages:
142–148
URL:
https://aclanthology.org/2026.africanlp-main.13/
Cite (ACL):
Maarten Janssen. 2026. Advancing African NLP: UDMorph and flexiPipe. In Proceedings of the 7th Workshop on African Natural Language Processing (AfricaNLP 2026), pages 142–148, Rabat, Morocco. Association for Computational Linguistics.
Cite (Informal):
Advancing African NLP: UDMorph and flexiPipe (Janssen, AfricaNLP 2026)
PDF:
https://aclanthology.org/2026.africanlp-main.13.pdf