Conversational AI for Virtual Standardized Patients using a Speech-to-Speech LLM

Andrew Emerson, Keelan Evanini, Su Somay, Kevin Frome, Le An Ha, Polina Harik


Abstract
To develop clinical reasoning skills, medical students are often tasked with interacting with trained standardized patients (SPs). Human SPs enable real conversations that can resemble authentic clinical scenarios. However, human SPs require extensive training, and their accessibility and availability to medical students or residents are often limited. Virtual SPs allow medical students to practice clinical interviews in a lower-stakes setting across a broader set of clinical cases. This paper introduces a virtual SP (VSP) that leverages Amazon’s Nova Sonic, a speech-to-speech foundation model designed for human-like conversation. We investigated the ability of Nova Sonic to portray four distinct clinical cases in virtual doctor-patient encounters with 20 third-year medical students. The system’s realism, perceived learning value, and user experience were assessed via a survey administered to the students. Students were also asked to compare this experience to interactions with a human SP. Survey results and conversations were analyzed to derive insights for improving the Nova Sonic-based VSP system.
Anthology ID:
2026.iwsds-1.16
Volume:
Proceedings of the 16th International Workshop on Spoken Dialogue System Technology
Month:
February
Year:
2026
Address:
Trento, Italy
Editors:
Giuseppe Riccardi, Seyed Mahed Mousavi, Maria Ines Torres, Koichiro Yoshino, Zoraida Callejas, Shammur Absar Chowdhury, Yun-Nung Chen, Frederic Bechet, Joakim Gustafson, Géraldine Damnati, Alex Papangelis, Luis Fernando D’Haro, John Mendonça, Raffaella Bernardi, Dilek Hakkani-Tur, Giuseppe "Pino" Di Fabbrizio, Tatsuya Kawahara, Firoj Alam, Gokhan Tur, Michael Johnston
Venue:
IWSDS
Publisher:
Association for Computational Linguistics
Pages:
142–152
URL:
https://aclanthology.org/2026.iwsds-1.16/
Cite (ACL):
Andrew Emerson, Keelan Evanini, Su Somay, Kevin Frome, Le An Ha, and Polina Harik. 2026. Conversational AI for Virtual Standardized Patients using a Speech-to-Speech LLM. In Proceedings of the 16th International Workshop on Spoken Dialogue System Technology, pages 142–152, Trento, Italy. Association for Computational Linguistics.
Cite (Informal):
Conversational AI for Virtual Standardized Patients using a Speech-to-Speech LLM (Emerson et al., IWSDS 2026)
PDF:
https://aclanthology.org/2026.iwsds-1.16.pdf