When LLMs Struggle: Reference-less Translation Evaluation for Low-resource Languages

Archchana Sindhujan, Diptesh Kanojia, Constantin Orasan, Shenbin Qian


Abstract
This paper investigates the reference-less evaluation of machine translation for low-resource language pairs, known as quality estimation (QE). Segment-level QE is a challenging cross-lingual language understanding task that assigns a quality score (0-100) to the translated output. We comprehensively evaluate large language models (LLMs) in zero-shot and few-shot scenarios and perform instruction fine-tuning using a novel prompt based on annotation guidelines. Our results indicate that prompt-based approaches are outperformed by encoder-based fine-tuned QE models. Our error analysis reveals tokenization issues, along with errors due to transliteration and named entities, and argues for refinement in LLM pre-training for cross-lingual tasks. We publicly release the data and trained models for further research.
Anthology ID:
2025.loreslm-1.33
Volume:
Proceedings of the First Workshop on Language Models for Low-Resource Languages
Month:
January
Year:
2025
Address:
Abu Dhabi, United Arab Emirates
Editors:
Hansi Hettiarachchi, Tharindu Ranasinghe, Paul Rayson, Ruslan Mitkov, Mohamed Gaber, Damith Premasiri, Fiona Anting Tan, Lasitha Uyangodage
Venues:
LoResLM | WS
Publisher:
Association for Computational Linguistics
Pages:
437–459
URL:
https://aclanthology.org/2025.loreslm-1.33/
Cite (ACL):
Archchana Sindhujan, Diptesh Kanojia, Constantin Orasan, and Shenbin Qian. 2025. When LLMs Struggle: Reference-less Translation Evaluation for Low-resource Languages. In Proceedings of the First Workshop on Language Models for Low-Resource Languages, pages 437–459, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
Cite (Informal):
When LLMs Struggle: Reference-less Translation Evaluation for Low-resource Languages (Sindhujan et al., LoResLM 2025)
PDF:
https://aclanthology.org/2025.loreslm-1.33.pdf