Ivan Oseledets


Layerwise universal adversarial attack on NLP models
Olga Tsymboi | Danil Malaev | Andrei Petrovskii | Ivan Oseledets
Findings of the Association for Computational Linguistics: ACL 2023

In this work, we examine the vulnerability of language models to universal adversarial triggers (UATs). We propose a new white-box approach to constructing layerwise UATs (LUATs), which searches for triggers by perturbing the hidden layers of a network. Using three transformer models and three datasets from the GLUE benchmark, we demonstrate that our method provides better transferability in a model-to-model setting, with an average gain of 9.3% in the fooling rate over the baseline. Moreover, we investigate trigger transferability in the task-to-task setting. Using small subsets of datasets similar to the target tasks to choose the perturbed layer, we show that LUATs outperform vanilla UATs by 7.1% in the fooling rate.

Efficient GPT Model Pre-training using Tensor Train Matrix Representation
Viktoriia Chekalina | Georgiy Novikov | Julia Gusak | Alexander Panchenko | Ivan Oseledets
Proceedings of the 37th Pacific Asia Conference on Language, Information and Computation


Tensorized Embedding Layers
Oleksii Hrinchuk | Valentin Khrulkov | Leyla Mirvakhabova | Elena Orlova | Ivan Oseledets
Findings of the Association for Computational Linguistics: EMNLP 2020

Embedding layers, which transform input words into real-valued vectors, are key components of deep neural networks used in natural language processing. However, when the vocabulary is large, the corresponding weight matrices can be enormous, which precludes their deployment in resource-limited settings. We introduce a novel way of parameterizing embedding layers based on the Tensor Train decomposition, which allows compressing the model significantly at the cost of a negligible drop or even a slight gain in performance. We evaluate our method on a wide range of natural language processing benchmarks and analyze the trade-off between performance and compression ratio for a variety of architectures, from MLPs to LSTMs and Transformers.


Riemannian Optimization for Skip-Gram Negative Sampling
Alexander Fonarev | Oleksii Grinchuk | Gleb Gusev | Pavel Serdyukov | Ivan Oseledets
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

The Skip-Gram Negative Sampling (SGNS) word embedding model, well known from its implementation in the “word2vec” software, is usually optimized by stochastic gradient descent. However, the optimization of the SGNS objective can be viewed as a problem of searching for a good matrix under a low-rank constraint. The standard way to solve this type of problem is to apply the Riemannian optimization framework to optimize the SGNS objective over the manifold of matrices of the required low rank. In this paper, we propose an algorithm that optimizes the SGNS objective using Riemannian optimization and demonstrate its superiority over popular competitors, such as the original method for training SGNS and SVD over the SPPMI matrix.