Artem Chervyakov
2026
Multimodal Evaluation of Russian-language Architectures
Artem Chervyakov | Ulyana Isaeva | Anton Emelyanov | Artem Safin | Maria Tikhonova | Alexander Kharitonov | Yulia Lyakh | Petr Surovtsev | Denis Shevelev | Vildan Saburov | Vasily Konovalov | Elisei Rykov | Ivan Sviridov | Amina Miftakhova | Ilseyar Alimova | Alexander Panchenko | Alexander Kapitanov | Alena Fenogenova
Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)
Multimodal large language models (MLLMs) are currently at the center of research attention, showing rapid progress in scale and capabilities, yet their intelligence, limitations, and risks remain insufficiently understood. To address these issues, particularly in the context of the Russian language, where no multimodal benchmarks currently exist, we introduce MERA Multi, an open multimodal evaluation framework for Russian-language architectures. The benchmark is instruction-based and encompasses text, image, audio, and video modalities, comprising 18 newly constructed evaluation tasks for both general-purpose models and modality-specific architectures (image-to-text, video-to-text, and audio-to-text). Our contributions include: (i) a universal taxonomy of multimodal abilities; (ii) 18 datasets created entirely from scratch with attention to Russian cultural and linguistic specificity, unified prompts, and metrics; (iii) baseline results for both closed-source and open-source models; (iv) a methodology for preventing benchmark leakage, including watermarking for private sets. While our current focus is on Russian, the proposed benchmark provides a replicable methodology for constructing multimodal benchmarks in typologically diverse languages, particularly within the Slavic language family.
From Standard Transformers to Modern LLMs: Bringing Dialogue Models, RAG, and Agents to the Classroom
Maria Tikhonova | Viktoriia A. Chekalina | Artem Chervyakov | Alexey Zaytsev | Alexander Panchenko
Proceedings of the Seventh Workshop on Teaching Natural Language Processing (TeachNLP 2026)
Modern LLM education is increasingly centered on system building: grounding generation with retrieval, enabling tool use, and deploying models under latency and cost constraints. We present an updated release of our open course on Transformer-based LLMs and multimodal models (Nikishina et al., 2024). The update introduces topics that have become important since the first edition: a session on Retrieval-Augmented Generation (RAG), a hands-on session on tool-using agents, an API-based track for applied work with LLMs, and practical local inference with vLLM. We also add a dedicated session on multimodal dialog models with a focus on dialog grounding, and enrich the course with a discussion of long-context transformers, focusing on KV-cache efficiency along with related models and benchmarks. All materials are released online.
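The RAG pattern the abstract mentions can be illustrated with a minimal sketch: retrieve the most relevant documents for a query, then prepend them as context before generation. The toy corpus, bag-of-words scoring, and prompt template below are illustrative assumptions, not the course's actual materials.

```python
# Minimal sketch of Retrieval-Augmented Generation (RAG):
# retrieve relevant documents, then ground the prompt in them.
# Corpus contents and scoring are hypothetical stand-ins.
from collections import Counter

CORPUS = [
    "The Transformer architecture relies on self-attention.",
    "vLLM serves LLMs efficiently with paged KV-cache memory.",
    "RAG grounds generation in documents retrieved at query time.",
]

def score(query: str, doc: str) -> int:
    # Bag-of-words overlap: count shared lowercase tokens.
    q = Counter(query.lower().split())
    d = Counter(doc.lower().split())
    return sum((q & d).values())

def retrieve(query: str, k: int = 1) -> list:
    # Return the k highest-overlap documents.
    return sorted(CORPUS, key=lambda doc: score(query, doc), reverse=True)[:k]

def build_prompt(query: str) -> str:
    # Ground the model by prepending retrieved context to the question.
    context = "\n".join(retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

prompt = build_prompt("How does RAG ground generation?")
```

In a real system the bag-of-words scorer would be replaced by a dense retriever and the prompt sent to an LLM (e.g., one served locally via vLLM).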
2025
GigaChat Family: Efficient Russian Language Modeling Through Mixture of Experts Architecture
Valentin Mamedov | Evgenii Kosarev | Gregory Leleytner | Ilya Shchuckin | Valeriy Berezovskiy | Daniil Smirnov | Dmitry Kozlov | Sergei Averkiev | Lukyanenko Ivan | Aleksandr Proshunin | Ainur Israfilova | Ivan Baskov | Artem Chervyakov | Emil Shakirov | Mikhail Kolesov | Daria Khomich | Daria Latortseva | Sergei Porkhun | Yury Fedorov | Oleg Kutuzov | Polina Kudriavtseva | Sofiia Soldatova | Kolodin Egor | Stanislav Pyatkin | Dzmitry Menshykh | Grafov Sergei IUrevich | Eldar Damirov | Vladimir Karlov | Ruslan Gaitukiev | Arkadiy Shatenov | Alena Fenogenova | Nikita Savushkin | Fedor Minkin
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)
Generative large language models (LLMs) have become crucial for modern NLP research and applications across various languages. However, the development of foundational models specifically tailored to the Russian language has been limited, primarily due to the significant computational resources required. This paper introduces the GigaChat family of Russian LLMs, available in various sizes, including base models and instruction-tuned versions. We provide a detailed report on the model architecture, pre-training process, and experiments to guide design choices. In addition, we evaluate their performance on Russian and English benchmarks and compare GigaChat with multilingual analogs. The paper presents a system demonstration of the top-performing models accessible via an API, a Telegram bot, and a Web interface. Furthermore, we have released three open GigaChat models in open-source, aiming to expand NLP research opportunities and support the development of industrial solutions for the Russian language.
2024
MERA: A Comprehensive LLM Evaluation in Russian
Alena Fenogenova | Artem Chervyakov | Nikita Martynov | Anastasia Kozlova | Maria Tikhonova | Albina Akhmetgareeva | Anton Emelyanov | Denis Shevelev | Pavel Lebedev | Leonid Sinev | Ulyana Isaeva | Katerina Kolomeytseva | Daniil Moskovskiy | Elizaveta Goncharova | Nikita Savushkin | Polina Mikhailova | Anastasia Minaeva | Denis Dimitrov | Alexander Panchenko | Sergey Markov
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Over the past few years, one of the most notable advancements in AI research has been in foundation models (FMs), headlined by the rise of language models (LMs). However, despite researchers’ attention and the rapid growth in LM application, the capabilities, limitations, and associated risks still need to be better understood. To address these issues, we introduce a new instruction benchmark, MERA, oriented towards the FMs’ performance on the Russian language. The benchmark encompasses 21 evaluation tasks for generative models covering 10 skills and is supplied with private answer scoring to prevent data leakage. The paper introduces a methodology to evaluate FMs and LMs in fixed zero- and few-shot instruction settings that can be extended to other modalities. We propose an evaluation methodology, an open-source code base for the MERA assessment, and a leaderboard with a submission system. We evaluate open LMs as baselines and find they are still far behind the human level. We publicly release MERA to guide forthcoming research, anticipate groundbreaking model features, standardize the evaluation procedure, and address potential ethical concerns and drawbacks.
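The fixed zero- and few-shot instruction setting described above can be sketched as follows: every model receives the identical instruction and demonstration examples, and answers are scored by exact match. The toy task, prompt template, and stub model below are illustrative assumptions in the spirit of the abstract, not MERA's actual prompts or scoring code.

```python
# Sketch of fixed few-shot instruction evaluation: identical
# demonstrations for every model, exact-match scoring.
def few_shot_prompt(instruction, shots, query):
    # Fixed shots: each evaluated model sees the same demonstrations.
    demos = "\n".join(f"Q: {q}\nA: {a}" for q, a in shots)
    return f"{instruction}\n{demos}\nQ: {query}\nA:"

def accuracy(model, instruction, shots, test_pairs):
    # Exact-match scoring against gold answers.
    hits = sum(
        model(few_shot_prompt(instruction, shots, q)).strip() == a
        for q, a in test_pairs
    )
    return hits / len(test_pairs)

# A toy "model" that always answers "yes" stands in for a real LM.
toy_model = lambda prompt: "yes"
acc = accuracy(
    toy_model,
    "Answer yes or no.",
    [("Is water wet?", "yes")],
    [("Is fire cold?", "no"), ("Is snow cold?", "yes")],
)
```

Keeping the prompt and shots fixed across models is what makes scores comparable on a leaderboard; private answer scoring then prevents test-set leakage.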
Co-authors
- Alena Fenogenova 3
- Alexander Panchenko 3
- Maria Tikhonova 3
- Anton Emelyanov 2
- Ulyana Isaeva 2
- Nikita Savushkin 2
- Denis Shevelev 2
- Albina Akhmetgareeva 1
- Ilseyar Alimova 1
- Sergei Averkiev 1
- Ivan Baskov 1
- Valeriy Berezovskiy 1
- Viktoriia A. Chekalina 1
- Eldar Damirov 1
- Denis Dimitrov 1
- Kolodin Egor 1
- Yury Fedorov 1
- Ruslan Gaitukiev 1
- Elizaveta Goncharova 1
- Grafov Sergei IUrevich 1
- Ainur Israfilova 1
- Lukyanenko Ivan 1
- Alexander Kapitanov 1
- Vladimir Karlov 1
- Alexander Kharitonov 1
- Daria Khomich 1
- Mikhail Kolesov 1
- Katerina Kolomeytseva 1
- Vasily Konovalov 1
- Evgenii Kosarev 1
- Dmitry Kozlov 1
- Anastasia Kozlova 1
- Polina Kudriavtseva 1
- Oleg Kutuzov 1
- Daria Latortseva 1
- Pavel Lebedev 1
- Gregory Leleytner 1
- Yulia Lyakh 1
- Valentin Mamedov 1
- Sergey Markov 1
- Nikita Martynov 1
- Dzmitry Menshykh 1
- Amina Miftakhova 1
- Polina Mikhailova 1
- Anastasia Minaeva 1
- Fedor Minkin 1
- Daniil Moskovskiy 1
- Sergei Porkhun 1
- Aleksandr Proshunin 1
- Stanislav Pyatkin 1
- Elisei Rykov 1
- Vildan Saburov 1
- Artem Safin 1
- Emil Shakirov 1
- Arkadiy Shatenov 1
- Ilya Shchuckin 1
- Leonid Sinev 1
- Daniil Smirnov 1
- Sofiia Soldatova 1
- Petr Surovtsev 1
- Ivan Sviridov 1
- Alexey Zaytsev 1