Sudanese-Flores: Extending FLORES+ to Sudanese Arabic Dialect

Hadia Mohmmedosman Ahmed Samil, David Ifeoluwa Adelani


Abstract
In this work, we introduce Sudanese-Flores, an extension of the popular FLORES+ machine translation (MT) benchmark to the Sudanese Arabic dialect. We translate both the DEV and DEVTEST splits of the Modern Standard Arabic dataset into the corresponding Sudanese dialect, resulting in a total of 2,009 sentences. Although the dialect was recently added to Google Translate, no benchmark is available for it, despite it being spoken by over 40 million people. Our evaluation of two leading LLMs, GPT-4.1 and Gemini 2.5 Flash, shows that while their English-to-Arabic performance is impressive (more than 23 BLEU), they struggle on the Sudanese dialect (less than 11 BLEU) in zero-shot settings. In the few-shot scenario, we achieve only a slight boost in performance.
Anthology ID:
2026.africanlp-main.25
Volume:
Proceedings of the 7th Workshop on African Natural Language Processing (AfricaNLP 2026)
Month:
March
Year:
2026
Address:
Rabat, Morocco
Editors:
Everlyn Asiko Chimoto, Constantine Lignos, Shamsuddeen Muhammad, Idris Abdulmumin, Clemencia Siro, David Ifeoluwa Adelani
Venues:
AfricaNLP | WS
Publisher:
Association for Computational Linguistics
Pages:
243–247
URL:
https://aclanthology.org/2026.africanlp-main.25/
Cite (ACL):
Hadia Mohmmedosman Ahmed Samil and David Ifeoluwa Adelani. 2026. Sudanese-Flores: Extending FLORES+ to Sudanese Arabic Dialect. In Proceedings of the 7th Workshop on African Natural Language Processing (AfricaNLP 2026), pages 243–247, Rabat, Morocco. Association for Computational Linguistics.
Cite (Informal):
Sudanese-Flores: Extending FLORES+ to Sudanese Arabic Dialect (Samil & Adelani, AfricaNLP 2026)
PDF:
https://aclanthology.org/2026.africanlp-main.25.pdf