Proceedings of the 21st Workshop of Young Researchers' Roundtable on Spoken Dialogue Systems

Ryan Whetten, Virgile Sucal, Anh Ngo, Kranti Chalamalasetti, Koji Inoue, Gaetano Cimino, Zachary Yang, Yuki Zenimoto, Ricardo Rodriguez (Editors)


Anthology ID:
2025.yrrsds-1
Month:
August
Year:
2025
Address:
Avignon, France
Venue:
YRRSDS
Publisher:
Association for Computational Linguistics
URL:
https://aclanthology.org/2025.yrrsds-1/

Proceedings of the 21st Workshop of Young Researchers' Roundtable on Spoken Dialogue Systems
Ryan Whetten | Virgile Sucal | Anh Ngo | Kranti Chalamalasetti | Koji Inoue | Gaetano Cimino | Zachary Yang | Yuki Zenimoto | Ricardo Rodriguez

Research on LLMs-Empowered Conversational AI for Sustainable Behaviour Change
Ben Chen

This position paper outlines my research interests, my views on spoken dialogue system (SDS) research, and suggested topics for discussion.

Deep Reinforcement Learning of LLMs using RLHF
Enoch Levandovsky

My main research interests lie in the application of Reinforcement Learning (RL) alignment of LLMs to human-robot dialogue. More specifically, my latest research uses RL alignment as an efficient training regime to train a newly initialized tiny LM to behave like a toddler. Previous research highlights the difficulty of building a robust tiny LM with an educated adult's level of understanding. Our hypothesis is that training a tiny LM to at least behave like a child is achievable with a very small number of parameters, especially when training efficiently with an RL-based LLM training regime. My interests also extend to applying RL to LLM training for dialogue management and planning.
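
As a rough illustration of the training regime described above, the sketch below runs a REINFORCE-style loop in which a tiny bigram policy is updated against a stand-in reward model. All names and the toy reward are hypothetical, and a real RLHF pipeline would use a learned preference model and a KL penalty against a reference model.

```python
import torch

# Hypothetical toy setup: a bigram "policy LM" over a 16-token vocabulary.
vocab, length = 16, 8
logits_table = torch.zeros(vocab, vocab, requires_grad=True)

def reward_model(seq):
    # Stand-in for a learned reward model (purely illustrative):
    # rewards child-like repetition of adjacent tokens.
    return float(sum(a == b for a, b in zip(seq, seq[1:])))

opt = torch.optim.Adam([logits_table], lr=0.1)
for step in range(200):
    seq, logps = [0], []
    for _ in range(length):
        dist = torch.distributions.Categorical(logits=logits_table[seq[-1]])
        tok = dist.sample()
        logps.append(dist.log_prob(tok))
        seq.append(int(tok))
    # REINFORCE: scale the sequence log-probability by the scalar reward.
    loss = -reward_model(seq) * torch.stack(logps).sum()
    opt.zero_grad()
    loss.backward()
    opt.step()
```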

Conversational Collaborative Robots
Chalamalasetti Kranti

Spoken dialogue systems (SDSs) aim to enable natural, interactive, and collaborative conversations. My research interest lies in leveraging these situated collaborative conversations to teach new concepts (skills) to collaborative robots (cobots). These cobots, when operating in manufacturing environments such as assembly lines, are envisioned to converse with humans, reach common ground, and learn new skills in one shot, without the need for multiple demonstrations. Unlike SDSs in consumer domains, these cobot-based systems must handle conversations in noisy, time-sensitive industrial settings. Motivated by these challenges, my research focuses on building collaborative dialogue systems capable of integrating conversational programming to translate situated dialogue into modular programs, knowing when to ask for clarifications, and adapting the program based on corrections.
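
A minimal sketch of the conversational-programming idea mentioned above, with an invented skill schema (not the author's actual system): parsed dialogue acts are compiled into a modular skill program, and a missing required argument triggers a clarification request instead of a guess.

```python
from dataclasses import dataclass

@dataclass
class Step:
    skill: str
    args: dict

# Hypothetical skill schema for an assembly-line cobot.
REQUIRED = {"pick": ("object",), "place": ("object", "location")}

def to_program(acts: list[dict]) -> tuple[list[Step], list[str]]:
    """Compile parsed dialogue acts into a modular program, collecting
    clarification questions for any missing required argument."""
    program, questions = [], []
    for act in acts:
        skill = act["skill"]
        missing = [a for a in REQUIRED[skill] if a not in act]
        if missing:
            questions.append(f"For '{skill}', what is the {missing[0]}?")
        else:
            program.append(Step(skill, {a: act[a] for a in REQUIRED[skill]}))
    return program, questions

# "Pick up the bracket, then place it." -- the placement location is unspecified.
acts = [{"skill": "pick", "object": "bracket"},
        {"skill": "place", "object": "bracket"}]
program, questions = to_program(acts)
print(questions)  # -> ["For 'place', what is the location?"]
```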

Dialogue System using Large Language Model-based Dynamic Slot Generation
Ekai Hashimoto

In this position paper, I present my research interests in dialogue systems that elicit users' career-related information. My work centres on two aspects. First, I seek to enhance the information-gathering capability of task-oriented systems by using large language models (LLMs) to generate slots dynamically, enabling the system to ask for deeper career details, such as reasons for leaving a job. Second, I propose a method, planned for future study, that decomposes and recomposes system questions along a “depth” axis so that sensitive information can be obtained more naturally. Finally, I discuss the positive and negative implications of combining LLMs with spoken dialogue systems (SDSs) and consider how SDS technology will interact with society.
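
The sketch below illustrates dynamic slot generation in the spirit described above; the prompt wording is invented and the LLM call is stubbed, so this is a shape, not the paper's actual method. Given the slots filled so far, the model proposes deeper follow-up slots (e.g., reasons for leaving a job).

```python
import json

def call_llm(prompt: str) -> str:
    # Stand-in for a real LLM API call; returns a JSON list of slot names.
    return json.dumps(["reason_for_leaving", "ideal_next_role"])

def generate_slots(filled: dict) -> list[str]:
    prompt = (
        "The user is discussing their career. Filled slots so far:\n"
        f"{json.dumps(filled)}\n"
        "Propose up to 2 deeper follow-up slots as a JSON list of names."
    )
    return json.loads(call_llm(prompt))

print(generate_slots({"current_job": "engineer", "years_of_experience": 5}))
# -> ['reason_for_leaving', 'ideal_next_role']
```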

Towards Adaptive Human-Agent Collaboration in Real-Time Environments
Kaito Nakae

My research interests lie in human-agent collaboration and user adaptation, with a particular emphasis on adaptation in real-time collaborative environments. The field of collaborative systems aims to support human teams in completing complex tasks efficiently while ensuring natural and adaptive interaction experiences. I investigate how AI agents can function as effective partners by adapting to their human collaborators. A central focus of my research is the personalization of agent behavior based on user proficiency. This includes methods for adapting the agent’s communication strategies according to the user’s skill level and task experience. To pursue this goal, I collected and analyzed a multimodal dataset of human-human interaction using a real-time collaborative cooking game environment (Wu et al., 2021; Liu et al., 2024). The chosen environment is characterized by its complex task mechanics and strict time constraints, which necessitate seamless coordination and elicit dynamic, natural collaborative behaviors such as role negotiation and error recovery. Through this analysis, I investigated how partners with different levels of task proficiency communicate and coordinate effectively. Based on the findings, I proposed practical design guidelines for future adaptive AI agents, enabling them to adjust their level of guidance and initiative in response to the user’s proficiency.
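
To make the guideline concrete, here is a small hypothetical sketch of proficiency-conditioned behavior (the metrics and thresholds are invented for illustration, not taken from the paper): simple task metrics drive how much guidance and initiative the agent takes.

```python
from dataclasses import dataclass

@dataclass
class TaskMetrics:
    orders_completed: int
    errors: int
    idle_seconds: float

def estimate_proficiency(m: TaskMetrics) -> float:
    """Crude proficiency score in [0, 1] derived from task metrics."""
    score = m.orders_completed / (m.orders_completed + 2 * m.errors + 1)
    return max(0.0, min(1.0, score - 0.01 * m.idle_seconds))

def guidance_level(proficiency: float) -> str:
    if proficiency < 0.3:
        return "step-by-step instructions, agent takes initiative"
    if proficiency < 0.7:
        return "hints on request, shared initiative"
    return "minimal guidance, user leads role negotiation"

print(guidance_level(estimate_proficiency(TaskMetrics(5, 1, 10.0))))
# -> hints on request, shared initiative
```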

Towards Human-Like Dialogue Systems: Integrating Multimodal Emotion Recognition and Non-Verbal Cue Generation
Jingjing Jiang

This position paper outlines my research vision for developing human-like dialogue systems capable of both perceiving and expressing emotions through multimodal communication. My current research focuses on two main areas: multimodal emotion recognition and non-verbal cue generation. For emotion recognition, I constructed a Japanese multimodal dialogue dataset that captures natural, dyadic face-to-face interactions and developed an emotional valence recognition model that integrates textual, speech, and physiological inputs. On the generation side, my research explores non-verbal cue generation for embodied conversational agents (ECAs). Finally, the paper discusses the future of spoken dialogue systems (SDSs), emphasizing the shift from traditional turn-based architectures to full-duplex, real-time, multimodal systems.
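
A minimal sketch of the kind of multimodal fusion involved, assuming hypothetical feature dimensions (e.g., 768-d text embeddings) rather than the paper's actual architecture: each modality is projected to a shared size, concatenated, and mapped to a valence score.

```python
import torch
import torch.nn as nn

class ValenceFusion(nn.Module):
    """Late-fusion valence regressor over text, speech, and physiological
    features (dimensions are illustrative assumptions)."""
    def __init__(self, d_text=768, d_speech=128, d_phys=32, d_hid=64):
        super().__init__()
        self.proj = nn.ModuleDict({
            "text": nn.Linear(d_text, d_hid),
            "speech": nn.Linear(d_speech, d_hid),
            "phys": nn.Linear(d_phys, d_hid),
        })
        self.head = nn.Sequential(nn.ReLU(), nn.Linear(3 * d_hid, 1), nn.Tanh())

    def forward(self, text, speech, phys):
        h = torch.cat([self.proj["text"](text),
                       self.proj["speech"](speech),
                       self.proj["phys"](phys)], dim=-1)
        return self.head(h)  # valence in [-1, 1]

model = ValenceFusion()
valence = model(torch.randn(1, 768), torch.randn(1, 128), torch.randn(1, 32))
```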

Controlling Dialogue Systems with Graph-Based Structures
Laetitia Mina Hilgendorf

Large Language Models (LLMs) have significantly advanced the capabilities of dialogue systems, yet they often lack controllability and consistency. My research investigates how explicit structure can be used to guide LLM-based dialogue systems, focusing in particular on graph-based methods. One line of work explores the use of dialogue flow graphs to represent possible user and system actions, enabling systems to constrain generation to goal-directed paths. These graphs serve as an interpretable interface between high-level dialogue policy and low-level natural language output, improving reliability and transparency. In parallel, I examine Retrieval-Augmented Generation (RAG) approaches that leverage knowledge graphs to ground responses in structured background information. I have evaluated how GraphRAG performs on dialogue data and contributed to methods for retrieving compact, relevant subgraphs to support contextually appropriate and verifiable responses. These approaches address the limitations of unguided retrieval and help integrate external knowledge into the generation process more effectively. Together, these directions aim to improve the controllability, grounding, and robustness of LLM-based dialogue systems. I am particularly interested in how graph-based representations can be used not only to structure knowledge, but also to inform and constrain interaction patterns.
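
As a toy illustration of the flow-graph idea (the states and actions are invented, not from the work described): the graph enumerates the system actions legal in each dialogue state, and the LLM's preferred action is accepted only if the graph permits it.

```python
# Hypothetical dialogue flow graph: state -> allowed system actions.
FLOW = {
    "start": ["greet"],
    "greet": ["ask_goal"],
    "ask_goal": ["clarify_goal", "confirm_goal"],
    "confirm_goal": ["execute", "ask_goal"],
}

def choose_action(state: str, llm_ranked: list[str]) -> str:
    """Pick the LLM's highest-ranked action that the graph permits,
    falling back to the first on-graph action if none match."""
    allowed = FLOW[state]
    for action in llm_ranked:
        if action in allowed:
            return action
    return allowed[0]

# The LLM prefers "execute", but the graph constrains it to "confirm_goal".
print(choose_action("ask_goal", ["execute", "confirm_goal"]))
```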

Multimodal Agentic Dialogue Systems for Situated Human-Robot Interaction
Virgile Sucal

This position paper presents the integration of dialogue systems into situated robotics, emphasizing the use of contextual information—particularly audiovisual perceptions—to inform dialogue policies. A central objective is the development of interaction policies that dynamically select contextually appropriate actions aligned with the user’s intentions and needs. The works presented in this paper explore proactive decision-making mechanisms in multimodal interaction settings and seek to enhance robotic expressiveness through nonverbal communication cues. Current efforts focus on evaluating and comparing approaches such as agentic workflows and reinforcement learning within a unified framework, aiming to facilitate more consistent and contextually aware human–robot interaction.
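
As an entirely hypothetical toy (features, weights, and actions invented for illustration, not the framework under evaluation), contextual action selection can be framed as scoring candidate actions against multimodal context features:

```python
# Audiovisual context features extracted by upstream perception (toy values).
CONTEXT = {"user_gaze_on_robot": 1.0, "speech_detected": 0.0, "user_idle": 1.0}

# Each candidate action is described by the context features that favor it.
ACTIONS = {
    "wait": {"speech_detected": 1.0},
    "offer_help_proactively": {"user_gaze_on_robot": 1.0, "user_idle": 1.0},
    "nod": {"speech_detected": 1.0, "user_gaze_on_robot": 0.5},
}

def score(action_feats: dict, context: dict) -> float:
    return sum(w * context.get(f, 0.0) for f, w in action_feats.items())

best = max(ACTIONS, key=lambda a: score(ACTIONS[a], CONTEXT))
print(best)  # -> offer_help_proactively
```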

Knowledge Graphs and Representational Models for Dialogue Systems
Nicholas Thomas Walker

I am interested in graph-based dialogue management for dialogue systems, specifically the use of knowledge graphs. Representations of knowledge combining information about the world with dialogue- or user-specific information, such as personal knowledge graphs (Balog and Kenter, 2019), are of particular interest to me. Knowledge graphs have the flexibility to represent diverse information such as dialogue-specific information, general world knowledge, and even situated knowledge in the case of embodied dialogue systems. Much of my previous work has investigated knowledge graphs in an HRI context that combined these attributes (Walker et al., 2022b).
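
A minimal sketch of how such a graph can mix knowledge types in one representation (the triple store and example facts are illustrative, not the cited systems):

```python
from collections import defaultdict

class TripleStore:
    """Toy knowledge graph mapping (subject, predicate) -> objects."""
    def __init__(self):
        self.spo = defaultdict(list)

    def add(self, subj: str, pred: str, obj: str):
        self.spo[(subj, pred)].append(obj)

    def query(self, subj: str, pred: str) -> list[str]:
        return self.spo.get((subj, pred), [])

kg = TripleStore()
kg.add("coffee", "is_a", "beverage")      # general world knowledge
kg.add("user", "dislikes", "coffee")      # personal/user-specific knowledge
kg.add("robot", "located_in", "kitchen")  # situated knowledge (embodied HRI)
print(kg.query("user", "dislikes"))       # -> ['coffee']
```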