Engineer Bainomugisha
2026
SALT-31: A Machine Translation Benchmark Dataset for 31 Ugandan Languages
Solomon Nsumba | Benjamin Akera | Evelyn Nafula Ouma | Medadi E. Ssentanda | Deo Kawalya | Engineer Bainomugisha | Ernest Tonny Mwebaze | John Quinn
Proceedings of the 7th Workshop on African Natural Language Processing (AfricaNLP 2026)
We present SALT-31, a benchmark dataset for evaluating machine translation models across 31 Ugandan languages. Unlike sentence-level evaluation sets, SALT-31 is constructed from short, scenario-driven mini-dialogues designed to preserve discourse context, pragmatics, and culturally grounded communication patterns common in everyday Ugandan settings. The dataset contains 100 English sentences organized into 20 typical communication scenarios, each represented as a five-sentence mini-sequence. It can therefore be used to evaluate both sentence-level and paragraph-level machine translation, and it covers nearly every language spoken in Uganda, a country with high linguistic diversity. It is available at https://huggingface.co/datasets/Sunbird/salt-31
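
The 20 x 5 structure described in the abstract suggests one simple way to derive paragraph-level evaluation units from the sentence-level data. The following is a minimal sketch, not the authors' tooling: it assumes the 100 sentences are available as a flat list ordered by scenario, which the abstract does not confirm, and the function names `group_into_scenarios` and `as_paragraphs` are hypothetical. Placeholder strings stand in for the real sentences.

```python
from typing import List, Sequence

# Five sentences per mini-dialogue, as stated in the abstract.
SCENARIO_LEN = 5


def group_into_scenarios(
    sentences: Sequence[str], size: int = SCENARIO_LEN
) -> List[List[str]]:
    """Split a flat, scenario-ordered list into consecutive blocks of `size`.

    ASSUMPTION: sentences belonging to one scenario are stored contiguously.
    """
    if len(sentences) % size != 0:
        raise ValueError(
            f"expected a multiple of {size} sentences, got {len(sentences)}"
        )
    return [list(sentences[i:i + size]) for i in range(0, len(sentences), size)]


def as_paragraphs(scenarios: Sequence[Sequence[str]]) -> List[str]:
    """Join each scenario's sentences into one paragraph-level unit."""
    return [" ".join(scenario) for scenario in scenarios]


# Placeholder data standing in for the 100 English source sentences.
sentences = [f"sentence {i}" for i in range(100)]

scenarios = group_into_scenarios(sentences)   # sentence-level view, grouped
paragraphs = as_paragraphs(scenarios)         # paragraph-level view
```

The same grouping would be applied to the reference translations and to system output, so that sentence-level and paragraph-level scores are computed over aligned units.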