Tracking the evolution of LLM capabilities for Belarusian with OpenAI Evals

Vladislav Poritski, Oksana Volchek, Maksim Aparovich, Volha Harytskaya, Pavel Smrz
Abstract
We examine how the capabilities of large language models (LLMs) have evolved on eight Belarusian language tasks contributed in 2023 to OpenAI’s Evals framework. We evaluate state-of-the-art models both on the original development sets and on newly created test sets. Results demonstrate significant but non-uniform progress over this period: some tasks are almost saturated, while others show little improvement beyond trivial baselines. Error analysis shows that certain challenges have not yet been addressed, e.g., misidentification of non-words as legitimate vocabulary, or conversion from modern to classical orthography. We release the datasets and the generated completions (https://doi.org/10.5281/zenodo.18163825).
Anthology ID: 2026.loreslm-1.33
Volume: Proceedings of the Second Workshop on Language Models for Low-Resource Languages (LoResLM 2026)
Month: March
Year: 2026
Address: Rabat, Morocco
Editors: Hansi Hettiarachchi, Tharindu Ranasinghe, Alistair Plum, Paul Rayson, Ruslan Mitkov, Mohamed Gaber, Damith Premasiri, Fiona Anting Tan, Lasitha Uyangodage
Venue: LoResLM
Publisher: Association for Computational Linguistics
Pages: 378–387
URL: https://aclanthology.org/2026.loreslm-1.33/
Cite (ACL): Vladislav Poritski, Oksana Volchek, Maksim Aparovich, Volha Harytskaya, and Pavel Smrz. 2026. Tracking the evolution of LLM capabilities for Belarusian with OpenAI Evals. In Proceedings of the Second Workshop on Language Models for Low-Resource Languages (LoResLM 2026), pages 378–387, Rabat, Morocco. Association for Computational Linguistics.
Cite (Informal): Tracking the evolution of LLM capabilities for Belarusian with OpenAI Evals (Poritski et al., LoResLM 2026)
PDF: https://aclanthology.org/2026.loreslm-1.33.pdf