Gleb Kuzmin


pdf bib
Uncertainty Estimation of Transformer Predictions for Misclassification Detection
Artem Vazhentsev | Gleb Kuzmin | Artem Shelmanov | Akim Tsvigun | Evgenii Tsymbalov | Kirill Fedyanin | Maxim Panov | Alexander Panchenko | Gleb Gusev | Mikhail Burtsev | Manvel Avetisian | Leonid Zhukov
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Uncertainty estimation (UE) of model predictions is a crucial step for a variety of tasks such as active learning, misclassification detection, adversarial attack detection, out-of-distribution detection, etc. Most of the works on modeling the uncertainty of deep neural networks evaluate these methods on image classification tasks. Little attention has been paid to UE in natural language processing. To fill this gap, we perform a vast empirical investigation of state-of-the-art UE methods for Transformer models on misclassification detection in named entity recognition and text classification tasks and propose two computationally efficient modifications, one of which approaches or even outperforms computationally intensive methods.

pdf bib
Towards Computationally Feasible Deep Active Learning
Akim Tsvigun | Artem Shelmanov | Gleb Kuzmin | Leonid Sanochkin | Daniil Larionov | Gleb Gusev | Manvel Avetisian | Leonid Zhukov
Findings of the Association for Computational Linguistics: NAACL 2022

Active learning (AL) is a prominent technique for reducing the annotation effort required for training machine learning models. Deep learning offers a solution for several essential obstacles to deploying AL in practice but introduces many others. One of such problems is the excessive computational resources required to train an acquisition model and estimate its uncertainty on instances in the unlabeled pool. We propose two techniques that tackle this issue for text classification and tagging tasks, offering a substantial reduction of AL iteration duration and the computational overhead introduced by deep acquisition models in AL. We also demonstrate that our algorithm that leverages pseudo-labeling and distilled models overcomes one of the essential obstacles revealed previously in the literature. Namely, it was shown that due to differences between an acquisition model used to select instances during AL and a successor model trained on the labeled data, the benefits of AL can diminish. We show that our algorithm, despite using a smaller and faster acquisition model, is capable of training a more expressive successor model with higher performance.


pdf bib
Fake news detection for the Russian language
Gleb Kuzmin | Daniil Larionov | Dina Pisarevskaya | Ivan Smirnov
Proceedings of the 3rd International Workshop on Rumours and Deception in Social Media (RDSM)

In this paper, we trained and compared different models for fake news detection in Russian. For this task, we used such language features as bag-of-n-grams and bag of Rhetorical Structure Theory features, and BERT embeddings. We also compared the score of our models with the human score on this task and showed that our models deal with fake news detection better. We investigated the nature of fake news by dividing it into two non-overlapping classes: satire and fake news. As a result, we obtained the set of models for fake news detection; the best of these models achieved 0.889 F1-score on the test set for 2 classes and 0.9076 F1-score on 3 classes task.