Alexander Kirillovich


2022

Sense-Annotated Corpus for Russian
Alexander Kirillovich | Natalia Loukachevitch | Maksim Kulaev | Angelina Bolshina | Dmitry Ilvovsky
Proceedings of the Fifth International Conference on Computational Linguistics in Bulgaria (CLIB 2022)

We present a sense-annotated corpus for Russian. The resource was obtained by manually annotating texts from OpenCorpora, an open corpus for the Russian language, with senses from the Russian wordnet RuWordNet. The annotation was used as a test collection for comparing unsupervised (Personalized PageRank) and pseudo-labeling methods for Russian word sense disambiguation.
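
As a rough illustration of the unsupervised baseline mentioned above, the following is a minimal Python sketch (not the paper's code) of word sense disambiguation via Personalized PageRank over a toy wordnet-style sense graph. The sense IDs, edges, and the use of networkx are illustrative assumptions, standing in for RuWordNet and a real sense inventory.

# Minimal sketch of knowledge-based WSD via Personalized PageRank.
# All node names and edges are hypothetical toy data, not RuWordNet.
import networkx as nx

# Toy sense graph: nodes are sense IDs, edges are semantic relations.
G = nx.Graph()
G.add_edges_from([
    ("bank.money", "finance"), ("bank.money", "loan"),
    ("bank.river", "river"),   ("bank.river", "shore"),
    ("loan", "finance"),       ("river", "shore"),
])

def disambiguate(target_senses, context_senses):
    """Rank candidate senses of the target word by Personalized PageRank
    mass, restarting the random walk at the senses of the context words."""
    personalization = {s: 1.0 for s in context_senses}
    ranks = nx.pagerank(G, alpha=0.85, personalization=personalization)
    return max(target_senses, key=lambda s: ranks[s])

# A context about rivers pulls the walk toward the riverbank sense.
print(disambiguate(["bank.money", "bank.river"], ["river", "shore"]))
# -> bank.river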

2020

Controlling Chat Bot Multi-Document Navigation with the Extended Discourse Trees
Dmitry Ilvovsky | Alexander Kirillovich | Boris Galitsky
Proceedings of the Fourth International Conference on Computational Linguistics in Bulgaria (CLIB 2020)

In this paper we learn how to manage a dialogue relying on the discourse structure of its utterances. We define extended discourse trees, introduce means to manipulate them, and outline scenarios of multi-document navigation that extend the abilities of an interactive information retrieval-based chat bot. We also provide evaluation results comparing conventional search with a chat bot enriched with multi-document navigation.
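
As an illustration only, here is a minimal Python sketch of how an extended discourse tree might be represented: per-document discourse trees whose elementary discourse units (EDUs) are joined by inter-document links, which a chat bot can traverse for multi-document navigation. The class layout, relation labels, and example texts are assumptions, not the paper's implementation.

# Sketch of an extended discourse tree: intra-document rhetorical
# structure plus cross-document links between related EDUs.
from dataclasses import dataclass, field

@dataclass
class EDU:
    doc_id: str
    text: str
    relation: str = "Elaboration"      # rhetorical relation to the parent EDU
    children: list = field(default_factory=list)
    cross_links: list = field(default_factory=list)  # EDUs in other documents

def navigation_options(edu):
    """Moves the dialogue can offer next: drill down within the current
    document, or jump to a linked EDU in another document."""
    options = [("same-doc", c) for c in edu.children]
    options += [("cross-doc", l) for l in edu.cross_links]
    return options

root = EDU("doc1", "The company reported record revenue.")
detail = EDU("doc1", "Revenue grew 20% year over year.")
other = EDU("doc2", "Analysts attribute the growth to new products.", relation="Cause")
root.children.append(detail)
detail.cross_links.append(other)

for kind, target in navigation_options(detail):
    print(kind, "->", f"[{target.doc_id}] {target.text}")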

Cross-lingual Transfer Learning for Semantic Role Labeling in Russian
Ilseyar Alimova | Elena Tutubalina | Alexander Kirillovich
Proceedings of the Fourth International Conference on Computational Linguistics in Bulgaria (CLIB 2020)

This work is devoted to the semantic role labeling (SRL) task in Russian. We investigate transfer learning strategies between the English FrameNet and Russian FrameBank corpora. We perform experiments with embeddings obtained from various types of multilingual language models, including BERT, XLM-R, MUSE, and LASER. For evaluation, we use the Russian FrameBank dataset. As source data for transfer learning, we experiment with the full version of FrameNet and a reduced dataset with a smaller set of semantic roles identical to those of FrameBank. Evaluation results demonstrate that BERT embeddings show the best transfer capabilities. The model pretrained on the reduced English SRL data and fine-tuned on the Russian SRL data achieves a macro-averaged F1-measure of 79.8%, above our baseline of 78.4%.
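
As a rough illustration of the two-stage transfer setup, the sketch below (assuming a multilingual BERT checkpoint, a token-classification head from Hugging Face Transformers, and placeholder dataloaders) pretrains an SRL tagger on English data and then fine-tunes it on Russian data. It is not the authors' code, and dataset loading is elided.

# Sketch of cross-lingual transfer for SRL as token classification.
# NUM_ROLES and the dataloaders (framenet_loader, framebank_loader)
# are hypothetical placeholders.
import torch
from transformers import AutoModelForTokenClassification, AutoTokenizer

NUM_ROLES = 10  # hypothetical size of the reduced role inventory
tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModelForTokenClassification.from_pretrained(
    "bert-base-multilingual-cased", num_labels=NUM_ROLES)

def train(model, dataloader, epochs, lr):
    """One generic token-classification loop, reused for both stages."""
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for batch in dataloader:  # dicts with input_ids, attention_mask, labels
            loss = model(**batch).loss
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()

# Stage 1: source-language pretraining on English SRL (reduced FrameNet roles).
# train(model, framenet_loader, epochs=3, lr=5e-5)
# Stage 2: target-language fine-tuning on Russian FrameBank.
# train(model, framebank_loader, epochs=3, lr=2e-5)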