Context-Aware Language Understanding in Human-Robot Dialogue with LLMs

Svetlana Stoyanchev, Youmna Farag, Simon Keizer, Mohan Li, Rama Sanand Doddipatla


Abstract
In this work, we explore the use of large language models (LLMs) as interpreters of user utterances within a human-robot language interface. A user interacting with a robot that operates in a physical environment should be able to issue commands that interrupt the robot’s actions, for example, corrections or refinements of the task. This study addresses the context-aware interpretation of user utterances, including those issued while the robot is actively engaged in task execution, and explores whether LLMs, without fine-tuning, can translate user commands into corresponding sequences of robot actions. Using an interactive multimodal interface, combining text and video, for a virtual robot operating in simulated home environments, we collect a dataset of user utterances that guide the robot through various household tasks, simultaneously capturing a manual interpretation when the automatic one fails. Driven by practical considerations, we use the collected dataset to compare the interpretive performance of GPT models with smaller publicly available alternatives. Our findings reveal that action-interrupting utterances pose challenges for all models. While GPT consistently outperforms the smaller models, interpretation accuracy improves across the board when relevant, dynamically selected in-context learning examples are included in the prompt.
Anthology ID:
2026.iwsds-1.27
Volume:
Proceedings of the 16th International Workshop on Spoken Dialogue System Technology
Month:
February
Year:
2026
Address:
Trento, Italy
Editors:
Giuseppe Riccardi, Seyed Mahed Mousavi, Maria Ines Torres, Koichiro Yoshino, Zoraida Callejas, Shammur Absar Chowdhury, Yun-Nung Chen, Frederic Bechet, Joakim Gustafson, Géraldine Damnati, Alex Papangelis, Luis Fernando D’Haro, John Mendonça, Raffaella Bernardi, Dilek Hakkani-Tur, Giuseppe "Pino" Di Fabbrizio, Tatsuya Kawahara, Firoj Alam, Gokhan Tur, Michael Johnston
Venue:
IWSDS
Publisher:
Association for Computational Linguistics
Pages:
262–274
URL:
https://aclanthology.org/2026.iwsds-1.27/
Cite (ACL):
Svetlana Stoyanchev, Youmna Farag, Simon Keizer, Mohan Li, and Rama Sanand Doddipatla. 2026. Context-Aware Language Understanding in Human-Robot Dialogue with LLMs. In Proceedings of the 16th International Workshop on Spoken Dialogue System Technology, pages 262–274, Trento, Italy. Association for Computational Linguistics.
Cite (Informal):
Context-Aware Language Understanding in Human-Robot Dialogue with LLMs (Stoyanchev et al., IWSDS 2026)
PDF:
https://aclanthology.org/2026.iwsds-1.27.pdf