Vladislav Maraev

2026

When social robots see our sketches: evaluating human perception of a robot and a VLM model performance in a drawing task
Viktoria Paraskevi Daniilidou | Nikolai Ilinykh | Vladislav Maraev
Proceedings of the 16th International Workshop on Spoken Dialogue System Technology

We introduce a multimodal framework for interactive drawing in a robot-assisted second language learning scenario. In this scenario, humans are asked to draw objects and spatial relations between them, while a social robot uses a vision-language model (VLM) to analyse whether the drawings are correct. The correctness decision that is passed to the human comes from a Wizard-of-Oz (WoZ) setup. Therefore, we use it to indirectly evaluate the quality of VLM predictions. We show that the task is very challenging for a VLM and that the approach to evaluating VLM performance matters: focusing on the correctness of prediction of certain features (objects, relations) provides a different evaluation picture from evaluating the model on its prediction of the content of the image as a whole. We also examine, through a questionnaire, how the appearance of the social agent and the type of feedback influence participants’ perception of the agent. The comparison of verbal feedback generated by large language models against simple pattern-based feedback did not show any significant effects, whereas the change in the robot’s appearance produced a significant difference in user ratings of the agent’s naturalness and social presence.

2025

Combining Information State Update, Harel Statecharts and LLMs for controllable and flexible Conversational AI
Vladislav Maraev | Alexander Berman | Staffan Larsson
Proceedings of the 2025 CLASP Conference on Language models And RePresentations (LARP)

The rise of LLM-based approaches to dialogue systems has created an increased need for controllable dialogue. This paper addresses this need by presenting an implementation of a dialogue system based on the information state update approach of Larsson (2002). This enables the integration of rule-based handling of dialogue, expressed by Harel’s statecharts (1987), and Larsson’s theoretical account grounded in theories of dialogue, expressed by information state update rules. We demonstrate how our approach applies to dialogue domains involving form-filling. We also propose how LLMs can be employed to inject domain knowledge and be used in various components of a hybrid dialogue system, while maintaining control over the overall dialogue logic.

2024

Proceedings of the 2024 CLASP Conference on Multimodality and Interaction in Language Learning
Amy Qiu | Bill Noble | David Pagmar | Vladislav Maraev | Nikolai Ilinykh
Proceedings of the 2024 CLASP Conference on Multimodality and Interaction in Language Learning

2023

Because is why: Children’s acquisition of topoi through why questions
Christine Howes | Ellen Breitholtz | Vladislav Maraev
Proceedings of the 2023 CLASP Conference on Learning with Small Data (LSD)

In this paper we look at how children learn the underlying principles of commonsense reasoning, sometimes referred to as topoi, which are prevalent in everyday dialogue. By examining the utterances of two children in the CHILDES corpus for whom there is extensive longitudinal data, we show how children can elicit topoi from their parents by asking why-questions. This strategy for the rapid acquisition of topoi peaks at around age three, suggesting that it is a critical step in becoming a fully competent language user.

2022

In this paper we examine different meaning representations that are commonly used in different natural language applications today and discuss their limits, both in terms of the aspects of the natural language meaning they are modelling and in terms of the aspects of the application for which they are used.

2021

Can the Transformer Learn Nested Recursion with Symbol Masking?
Jean-Philippe Bernardy | Adam Ek | Vladislav Maraev
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

Large-scale text pre-training helps with dialogue act recognition, but not without fine-tuning
Bill Noble | Vladislav Maraev
Proceedings of the 14th International Conference on Computational Semantics (IWCS)

We use dialogue act recognition (DAR) to investigate how well BERT represents utterances in dialogue, and how fine-tuning and large-scale pre-training contribute to its performance. We find that while both the standard BERT pre-training and pre-training on dialogue-like data are useful, task-specific fine-tuning is essential for good performance.

Why Should I Turn Left? Towards Active Explainability for Spoken Dialogue Systems.
Vladislav Maraev | Ellen Breitholtz | Christine Howes | Jean-Philippe Bernardy
Proceedings of the Reasoning and Interaction Conference (ReInAct 2021)

In this paper we argue that dialogue systems can use enthymematic reasoning to actively explain their decisions. We motivate why this is an appropriate strategy and integrate it within our own proof-theoretic dialogue manager framework based on linear logic. In particular, this enables a dialogue system to provide reasonable answers to why-questions that query information previously given by the system.

This paper presents the results of systematic experimentation on how different types of questions affect duplicate question detection, across a number of established approaches and a novel, superior one used to address this language processing task. This study yields novel insight into the different levels of robustness of the diverse detection methods under different conditions of application, including ones that approximate real usage scenarios.
