Anton Emelyanov


2024

pdf bib
MERA: A Comprehensive LLM Evaluation in Russian
Alena Fenogenova | Artem Chervyakov | Nikita Martynov | Anastasia Kozlova | Maria Tikhonova | Albina Akhmetgareeva | Anton Emelyanov | Denis Shevelev | Pavel Lebedev | Leonid Sinev | Ulyana Isaeva | Katerina Kolomeytseva | Daniil Moskovskiy | Elizaveta Goncharova | Nikita Savushkin | Polina Mikhailova | Anastasia Minaeva | Denis Dimitrov | Alexander Panchenko | Sergey Markov
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Over the past few years, one of the most notable advancements in AI research has been in foundation models (FMs), headlined by the rise of language models (LMs). However, despite researchers’ attention and the rapid growth in LM application, the capabilities, limitations, and associated risks still need to be better understood. To address these issues, we introduce a new instruction benchmark, MERA, oriented towards the FMs’ performance on the Russian language. The benchmark encompasses 21 evaluation tasks for generative models covering 10 skills and is supplied with private answer scoring to prevent data leakage. The paper introduces a methodology to evaluate FMs and LMs in fixed zero- and few-shot instruction settings that can be extended to other modalities. We propose an evaluation methodology, an open-source code base for the MERA assessment, and a leaderboard with a submission system. We evaluate open LMs as baselines and find they are still far behind the human level. We publicly release MERA to guide forthcoming research, anticipate groundbreaking model features, standardize the evaluation procedure, and address potential ethical concerns and drawbacks.

2020

pdf bib
Humans Keep It One Hundred: an Overview of AI Journey
Tatiana Shavrina | Anton Emelyanov | Alena Fenogenova | Vadim Fomin | Vladislav Mikhailov | Andrey Evlampiev | Valentin Malykh | Vladimir Larin | Alex Natekin | Aleksandr Vatulin | Peter Romov | Daniil Anastasiev | Nikolai Zinov | Andrey Chertok
Proceedings of the Twelfth Language Resources and Evaluation Conference

Artificial General Intelligence (AGI) is showing growing performance in numerous applications - beating human performance in Chess and Go, using knowledge bases and text sources to answer questions (SQuAD) and even pass human examination (Aristo project). In this paper, we describe the results of AI Journey, a competition of AI-systems aimed to improve AI performance on knowledge bases, reasoning and text generation. Competing systems pass the final native language exam (in Russian), including versatile grammar tasks (test and open questions) and an essay, achieving a high score of 69%, with 68% being an average human result. During the competition, a baseline for the task and essay parts was proposed, and 80+ systems were submitted, showing different approaches to task understanding and reasoning. All the data and solutions can be found on github https://github.com/sberbank-ai/combined_solution_aij2019

2019

pdf bib
Multilingual Named Entity Recognition Using Pretrained Embeddings, Attention Mechanism and NCRF
Anton Emelyanov | Ekaterina Artemova
Proceedings of the 7th Workshop on Balto-Slavic Natural Language Processing

In this paper we tackle multilingual named entity recognition task. We use the BERT Language Model as embeddings with bidirectional recurrent network, attention, and NCRF on the top. We apply multilingual BERT only as embedder without any fine-tuning. We test out model on the dataset of the BSNLP shared task, which consists of texts in Bulgarian, Czech, Polish and Russian languages.