Konstantin Zaitsev


2025

In this paper we present an approach aimed at effectively expanding the context by integrating an associative-memory database into the pipeline. To improve long-term memory and personalization, we employ methods close to Retrieval-Augmented Generation (RAG). Our method uses a multi-agent pipeline with a cold-start agent for initial interactions, a fact-extraction agent to process user inputs, an associative-memory agent for storing and retrieving context, and a generation agent for replying to the user's queries. Evaluation shows promising results: a 41-percentage-point accuracy improvement over the base Gemma3 model (from 16% to 57%). With our approach, we thus demonstrate that personalized chatbots can bypass LLM memory limitations while increasing information reliability under conditions of limited context and memory.
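The agent roles described above can be sketched in a minimal form. This is an illustrative toy, not the paper's implementation: the class and function names are invented, the retrieval step uses simple word overlap as a stand-in for embedding-based RAG retrieval, and the extraction and generation agents are placeholders for LLM calls.

```python
from dataclasses import dataclass, field
import re


def _words(text: str) -> set[str]:
    """Lowercased word tokens, stripped of punctuation."""
    return set(re.findall(r"[a-z']+", text.lower()))


@dataclass
class AssociativeMemory:
    """Stores extracted facts and retrieves those relevant to a query."""
    facts: list[str] = field(default_factory=list)

    def store(self, fact: str) -> None:
        self.facts.append(fact)

    def retrieve(self, query: str, top_k: int = 2) -> list[str]:
        # Rank facts by word overlap with the query: a crude stand-in
        # for the embedding-based retrieval a RAG pipeline would use.
        q = _words(query)
        ranked = sorted(self.facts,
                        key=lambda f: len(q & _words(f)),
                        reverse=True)
        return ranked[:top_k]


def fact_extraction_agent(user_input: str) -> str:
    # Placeholder: the real agent would call an LLM to distil a fact.
    return user_input.strip()


def generation_agent(query: str, memory: AssociativeMemory) -> str:
    # Placeholder: the real agent would prompt an LLM with this context.
    context = "; ".join(memory.retrieve(query))
    return f"[{context}] -> answer to: {query}"


memory = AssociativeMemory()
memory.store(fact_extraction_agent("The user's favourite city is Vienna"))
memory.store(fact_extraction_agent("The user is allergic to peanuts"))
print(generation_agent("What is the user's favourite city?", memory))
```

The point of the separation is that the memory store persists across turns, so later queries can be answered from facts extracted in earlier interactions rather than from the limited context window.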

2022

Linguistic borrowings occur in all languages. The Andic languages of the Caucasus have borrowings from various donor languages such as Russian, Arabic, and Persian. To detect these borrowings automatically, we propose a logistic regression model. The model was trained on a dataset containing words in IPA transcription taken from dictionaries of Andic languages. To improve the model's quality, we compared TF-IDF and count vectorizers and chose the latter. In addition, we added new features to the model, extracted both by analyzing the vectorizer features and by using a language model. The model was evaluated with standard classification metrics (precision, recall, and F1-score). The best F1-score averaged over all languages for words in IPA was about 0.78. Experiments showed that our model achieves good results not only on words in IPA but also on words in Cyrillic.
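The classifier described above can be sketched from scratch: character-bigram count features (a hand-rolled count vectorizer) fed into a logistic regression trained by gradient descent. The toy IPA-like words and labels below are invented for illustration; the paper's actual model is trained on dictionary data with richer features.

```python
import math
from collections import Counter

# Toy data, invented for illustration: 1 = borrowing, 0 = native word.
words = ["maʃina", "kniga", "tʃaj", "riqʼʷa", "harkʼa", "ʕali"]
labels = [1, 1, 1, 0, 0, 0]


def bigrams(word):
    """Character-bigram counts: a minimal 'count vectorizer'."""
    padded = f"#{word}#"  # mark word boundaries
    return Counter(padded[i:i + 2] for i in range(len(padded) - 1))


vocab = sorted({bg for w in words for bg in bigrams(w)})
index = {bg: i for i, bg in enumerate(vocab)}


def vectorize(word):
    vec = [0.0] * len(vocab)
    for bg, n in bigrams(word).items():
        if bg in index:  # unseen bigrams are ignored
            vec[index[bg]] = float(n)
    return vec


X = [vectorize(w) for w in words]

# Logistic regression trained with stochastic gradient descent.
weights = [0.0] * len(vocab)
bias = 0.0
lr = 0.5
for _ in range(500):
    for xi, yi in zip(X, labels):
        z = bias + sum(w * x for w, x in zip(weights, xi))
        p = 1.0 / (1.0 + math.exp(-z))  # sigmoid
        err = p - yi
        bias -= lr * err
        weights = [w - lr * err * x for w, x in zip(weights, xi)]


def predict(word):
    z = bias + sum(w * x for w, x in zip(weights, vectorize(word)))
    return int(z > 0)


print([predict(w) for w in words])
```

In practice one would use a library implementation (e.g. scikit-learn's `CountVectorizer` with `analyzer="char"` plus `LogisticRegression`), but the sketch shows why count features suit the task: borrowings tend to carry character sequences rare in native vocabulary, and the regression weights pick those sequences out.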