Long Context Benchmark for the Russian Language

Igor Churin; Murat Apishev; Maria Tikhonova; Denis Shevelev; Aydar Bulatov; Yurii Kuratov; Sergei Averkiev; Alena Fenogenova

doi:10.18653/v1/2025.codi-1.1

Abstract

Recent progress in Natural Language Processing (NLP) has driven the creation of Large Language Models (LLMs) capable of tackling a vast range of tasks. A critical property of these models is their ability to handle large documents and process long token sequences, which has fostered the need for a robust evaluation methodology for long-text scenarios. To meet this requirement in the context of the Russian language, we present our benchmark consisting of 18 datasets designed to assess LLM performance in tasks such as information retrieval, knowledge extraction, machine reading, question answering, and reasoning. These datasets are categorized into four levels of complexity, enabling model evaluation across context lengths up to 128k tokens. To facilitate further research, we provide open-source datasets, a codebase, and a public leaderboard associated with the benchmark.

Anthology ID: 2025.codi-1.1
Volume: Proceedings of the 6th Workshop on Computational Approaches to Discourse, Context and Document-Level Inferences (CODI 2025)
Month: November
Year: 2025
Address: Suzhou, China
Editors: Michael Strube, Chloe Braud, Christian Hardmeier, Junyi Jessy Li, Sharid Loaiciga, Amir Zeldes, Chuyuan Li
Venues: CODI | WS
Publisher: Association for Computational Linguistics
Pages: 1–13
URL: https://aclanthology.org/2025.codi-1.1/
DOI: 10.18653/v1/2025.codi-1.1
Cite (ACL): Igor Churin, Murat Apishev, Maria Tikhonova, Denis Shevelev, Aydar Bulatov, Yuri Kuratov, Sergei Averkiev, and Alena Fenogenova. 2025. Long Context Benchmark for the Russian Language. In Proceedings of the 6th Workshop on Computational Approaches to Discourse, Context and Document-Level Inferences (CODI 2025), pages 1–13, Suzhou, China. Association for Computational Linguistics.
Cite (Informal): Long Context Benchmark for the Russian Language (Churin et al., CODI 2025)
PDF: https://aclanthology.org/2025.codi-1.1.pdf
Supplementary material: 2025.codi-1.1.SupplementaryMaterial.zip
