@inproceedings{li-etal-2025-evaluating-wmt,
title = "Evaluating {WMT} 2025 Metrics Shared Task Submissions on the {SSA}-{MTE} {A}frican Challenge Set",
author = "Li, Senyu and
Ali, Felermino Dario Mario and
Wang, Jiayi and
Sousa-Silva, Rui and
Lopes Cardoso, Henrique and
Stenetorp, Pontus and
Cherry, Colin and
Adelani, David Ifeoluwa",
editor = "Haddow, Barry and
Kocmi, Tom and
Koehn, Philipp and
Monz, Christof",
booktitle = "Proceedings of the Tenth Conference on Machine Translation",
month = nov,
year = "2025",
address = "Suzhou, China",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2025.wmt-1.65/",
pages = "913--919",
ISBN = "979-8-89176-341-8",
abstract = "This paper presents the evaluation of submissions to the WMT 2025 Metrics Shared Task on the SSA-MTE challenge set, a large-scale benchmark for machine translation evaluation (MTE) in Sub-Saharan African languages. The SSA-MTE test sets contains over 12,768 human-annotated adequacy scores across 11 language pairs sourced from English, French, and Portuguese, spanning 6 commercial and open-source MT systems. Results show that correlations with human judgments remain generally low, with most systems falling below the 0.4 Spearman threshold for medium-level agreement. Performance varies widely across language pairs, with most correlations under 0.4; in some extremely low-resource cases, such as Portuguese{--}Emakhuwa, correlations drop to around 0.1, underscoring the difficulty of evaluating MT for very low-resource African languages. These findings highlight the urgent need for more research on robust, generalizable MT evaluation methods tailored for African languages."
}
<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="li-etal-2025-evaluating-wmt">
<titleInfo>
<title>Evaluating WMT 2025 Metrics Shared Task Submissions on the SSA-MTE African Challenge Set</title>
</titleInfo>
<name type="personal">
<namePart type="given">Senyu</namePart>
<namePart type="family">Li</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Felermino</namePart>
<namePart type="given">Dario</namePart>
<namePart type="given">Mario</namePart>
<namePart type="family">Ali</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Jiayi</namePart>
<namePart type="family">Wang</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Rui</namePart>
<namePart type="family">Sousa-Silva</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Henrique</namePart>
<namePart type="family">Lopes Cardoso</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Pontus</namePart>
<namePart type="family">Stenetorp</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Colin</namePart>
<namePart type="family">Cherry</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">David</namePart>
<namePart type="given">Ifeoluwa</namePart>
<namePart type="family">Adelani</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<originInfo>
<dateIssued>2025-11</dateIssued>
</originInfo>
<typeOfResource>text</typeOfResource>
<relatedItem type="host">
<titleInfo>
<title>Proceedings of the Tenth Conference on Machine Translation</title>
</titleInfo>
<name type="personal">
<namePart type="given">Barry</namePart>
<namePart type="family">Haddow</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Tom</namePart>
<namePart type="family">Kocmi</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Philipp</namePart>
<namePart type="family">Koehn</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Christof</namePart>
<namePart type="family">Monz</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<originInfo>
<publisher>Association for Computational Linguistics</publisher>
<place>
<placeTerm type="text">Suzhou, China</placeTerm>
</place>
</originInfo>
<genre authority="marcgt">conference publication</genre>
<identifier type="isbn">979-8-89176-341-8</identifier>
</relatedItem>
<abstract>This paper presents the evaluation of submissions to the WMT 2025 Metrics Shared Task on the SSA-MTE challenge set, a large-scale benchmark for machine translation evaluation (MTE) in Sub-Saharan African languages. The SSA-MTE test sets contain over 12,768 human-annotated adequacy scores across 11 language pairs sourced from English, French, and Portuguese, spanning 6 commercial and open-source MT systems. Results show that correlations with human judgments remain generally low, with most systems falling below the 0.4 Spearman threshold for medium-level agreement. Performance varies widely across language pairs, with most correlations under 0.4; in some extremely low-resource cases, such as Portuguese–Emakhuwa, correlations drop to around 0.1, underscoring the difficulty of evaluating MT for very low-resource African languages. These findings highlight the urgent need for more research on robust, generalizable MT evaluation methods tailored for African languages.</abstract>
<identifier type="citekey">li-etal-2025-evaluating-wmt</identifier>
<location>
<url>https://aclanthology.org/2025.wmt-1.65/</url>
</location>
<part>
<date>2025-11</date>
<extent unit="page">
<start>913</start>
<end>919</end>
</extent>
</part>
</mods>
</modsCollection>
%0 Conference Proceedings
%T Evaluating WMT 2025 Metrics Shared Task Submissions on the SSA-MTE African Challenge Set
%A Li, Senyu
%A Ali, Felermino Dario Mario
%A Wang, Jiayi
%A Sousa-Silva, Rui
%A Lopes Cardoso, Henrique
%A Stenetorp, Pontus
%A Cherry, Colin
%A Adelani, David Ifeoluwa
%Y Haddow, Barry
%Y Kocmi, Tom
%Y Koehn, Philipp
%Y Monz, Christof
%S Proceedings of the Tenth Conference on Machine Translation
%D 2025
%8 November
%I Association for Computational Linguistics
%C Suzhou, China
%@ 979-8-89176-341-8
%F li-etal-2025-evaluating-wmt
%X This paper presents the evaluation of submissions to the WMT 2025 Metrics Shared Task on the SSA-MTE challenge set, a large-scale benchmark for machine translation evaluation (MTE) in Sub-Saharan African languages. The SSA-MTE test sets contain over 12,768 human-annotated adequacy scores across 11 language pairs sourced from English, French, and Portuguese, spanning 6 commercial and open-source MT systems. Results show that correlations with human judgments remain generally low, with most systems falling below the 0.4 Spearman threshold for medium-level agreement. Performance varies widely across language pairs, with most correlations under 0.4; in some extremely low-resource cases, such as Portuguese–Emakhuwa, correlations drop to around 0.1, underscoring the difficulty of evaluating MT for very low-resource African languages. These findings highlight the urgent need for more research on robust, generalizable MT evaluation methods tailored for African languages.
%U https://aclanthology.org/2025.wmt-1.65/
%P 913-919
Markdown (Informal)
[Evaluating WMT 2025 Metrics Shared Task Submissions on the SSA-MTE African Challenge Set](https://aclanthology.org/2025.wmt-1.65/) (Li et al., WMT 2025)
ACL
Senyu Li, Felermino Dario Mario Ali, Jiayi Wang, Rui Sousa-Silva, Henrique Lopes Cardoso, Pontus Stenetorp, Colin Cherry, and David Ifeoluwa Adelani. 2025. Evaluating WMT 2025 Metrics Shared Task Submissions on the SSA-MTE African Challenge Set. In Proceedings of the Tenth Conference on Machine Translation, pages 913–919, Suzhou, China. Association for Computational Linguistics.
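For readers who want to reproduce the kind of analysis summarized in the abstract, the snippet below is a minimal sketch (not the authors' evaluation code) of computing a segment-level Spearman correlation between automatic metric scores and human adequacy annotations. The score arrays are hypothetical placeholder values, not data from the SSA-MTE test sets.

```python
# Minimal sketch: segment-level Spearman correlation between metric scores
# and human adequacy scores, as described in the paper's abstract.
# The values below are hypothetical placeholders, not SSA-MTE data.
from scipy.stats import spearmanr

human_adequacy = [0.9, 0.4, 0.7, 0.2, 0.8]   # hypothetical human annotations
metric_scores  = [0.85, 0.5, 0.6, 0.3, 0.9]  # hypothetical metric outputs

rho, p_value = spearmanr(metric_scores, human_adequacy)
print(f"Spearman rho = {rho:.3f} (p = {p_value:.3f})")
# The abstract reports that most submissions fall below rho = 0.4
# on the SSA-MTE language pairs.
```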