Anchoring the Judge: Curriculum-Based Adaptation and Reference-Anchored MQM for LLM-Based Machine Translation of an Unseen Low-Resource Language - A Case of Nupe

Umar Baba Umar, Sulaimon Adebayo Bashir, Abdulmalik Danlami Mohammed


Abstract
Adapting large language models (LLMs) for machine translation has shown strong performance in low-resource languages; however, their effectiveness for unseen, extremely low-resource languages remains largely unexplored. We present NupeMT-QLoRA, a curriculum-based adaptation framework for the Nupe–English language pair. Our approach employs a two-stage QLoRA fine-tuning strategy: (i) initial training on 34k noisy parallel sentence pairs, followed by (ii) continued fine-tuning on a smaller, cleaner set of 12k bidirectional parallel sentences with explicit translation-direction tags. This staged curriculum stabilizes optimization and improves robustness under severe data scarcity.

We further identify a reliability crisis in existing automatic evaluation metrics for unseen languages. Popular LLM-based judges such as GEMBA and xCOMET exhibit weak correlation with human judgments (Kendall’s τ ≈ 0.21) and low inter-rater reliability (Fleiss’ κ ≈ 0.27), largely due to fluency bias. To address this, we propose Ref-Anchor-MQM, a reference-anchored evaluation protocol that forces the judge to extract Key Semantic Units from a human reference before scoring.

Experimental results show that NupeMT-QLoRA substantially outperforms NLLB-200, improving chrF++ from 22.73 to 41.10, while Ref-Anchor-MQM achieves significantly higher alignment with human evaluation (τ = 0.71). Our framework provides a scalable pipeline for adapting and evaluating LLMs on languages with zero prior representation.
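The Kendall’s τ figures quoted in the abstract measure rank agreement between a metric’s scores and human judgments. As a minimal illustration (not the paper’s evaluation code, and with toy scores that are not from the paper), the τ-a variant can be computed in pure Python:

```python
def kendall_tau(metric_scores, human_scores):
    """Kendall's tau-a: (concordant - discordant) pairs over all pairs.

    tau = 1.0 means the metric ranks translations exactly as humans do;
    tau near 0 (e.g. the ~0.21 reported for fluency-biased LLM judges)
    means the metric's ranking is close to uncorrelated with humans.
    """
    n = len(metric_scores)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            # A pair is concordant if both score lists order it the same way.
            s = (metric_scores[i] - metric_scores[j]) * (human_scores[i] - human_scores[j])
            if s > 0:
                concordant += 1
            elif s < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)

# Hypothetical rankings of five translations (illustrative only):
human = [1, 2, 3, 4, 5]
judge = [2, 1, 3, 5, 4]           # two adjacent pairs swapped
print(kendall_tau(judge, human))  # 0.6
```

Of the ten pairs, eight are ordered the same way by both raters and two are swapped, giving (8 − 2)/10 = 0.6; a perfectly aligned judge would score 1.0.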
Anthology ID:
2026.loreslm-1.19
Volume:
Proceedings of the Second Workshop on Language Models for Low-Resource Languages (LoResLM 2026)
Month:
March
Year:
2026
Address:
Rabat, Morocco
Editors:
Hansi Hettiarachchi, Tharindu Ranasinghe, Alistair Plum, Paul Rayson, Ruslan Mitkov, Mohamed Gaber, Damith Premasiri, Fiona Anting Tan, Lasitha Uyangodage
Venue:
LoResLM
Publisher:
Association for Computational Linguistics
Pages:
200–211
URL:
https://aclanthology.org/2026.loreslm-1.19/
Cite (ACL):
Umar Baba Umar, Sulaimon Adebayo Bashir, and Abdulmalik Danlami Mohammed. 2026. Anchoring the Judge: Curriculum-Based Adaptation and Reference-Anchored MQM for LLM-Based Machine Translation of an Unseen Low-Resource Language - A Case of Nupe. In Proceedings of the Second Workshop on Language Models for Low-Resource Languages (LoResLM 2026), pages 200–211, Rabat, Morocco. Association for Computational Linguistics.
Cite (Informal):
Anchoring the Judge: Curriculum-Based Adaptation and Reference-Anchored MQM for LLM-Based Machine Translation of an Unseen Low-Resource Language - A Case of Nupe (Umar et al., LoResLM 2026)
PDF:
https://aclanthology.org/2026.loreslm-1.19.pdf