The Roles of English in Evaluating Multilingual Language Models

Wessel Poelman, Miryam de Lhoneux


Abstract
Multilingual natural language processing is receiving increased attention, with numerous models, benchmarks, and methods being released for many languages. English is often used in multilingual evaluation to prompt language models (LMs), mainly to overcome the lack of instruction-tuning data in other languages. In this position paper, we lay out two roles of English in multilingual LM evaluations: as an interface and as a natural language. We argue that these roles have different goals: task performance versus language understanding. This discrepancy is highlighted with examples from datasets and evaluation setups. Numerous works explicitly use English as an interface to boost task performance. We recommend moving away from these imprecise methods and instead focusing on language understanding.
Anthology ID:
2025.nodalida-1.53
Volume:
Proceedings of the Joint 25th Nordic Conference on Computational Linguistics and 11th Baltic Conference on Human Language Technologies (NoDaLiDa/Baltic-HLT 2025)
Month:
March
Year:
2025
Address:
Tallinn, Estonia
Editors:
Richard Johansson, Sara Stymne
Venue:
NoDaLiDa
Publisher:
University of Tartu Library
Pages:
492–498
URL:
https://aclanthology.org/2025.nodalida-1.53/
Cite (ACL):
Wessel Poelman and Miryam de Lhoneux. 2025. The Roles of English in Evaluating Multilingual Language Models. In Proceedings of the Joint 25th Nordic Conference on Computational Linguistics and 11th Baltic Conference on Human Language Technologies (NoDaLiDa/Baltic-HLT 2025), pages 492–498, Tallinn, Estonia. University of Tartu Library.
Cite (Informal):
The Roles of English in Evaluating Multilingual Language Models (Poelman & de Lhoneux, NoDaLiDa 2025)
PDF:
https://aclanthology.org/2025.nodalida-1.53.pdf