@inproceedings{jiang-2025-towards,
title = "Towards Human-Like Dialogue Systems: Integrating Multimodal Emotion Recognition and Non-Verbal Cue Generation",
author = "Jiang, Jingjing",
editor = "Whetten, Ryan and
Sucal, Virgile and
Ngo, Anh and
Chalamalasetti, Kranti and
Inoue, Koji and
Cimino, Gaetano and
Yang, Zachary and
Zenimoto, Yuki and
Rodriguez, Ricardo",
booktitle = "Proceedings of the 21st Workshop of Young Researchers' Roundtable on Spoken Dialogue Systems",
month = aug,
year = "2025",
address = "Avignon, France",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2025.yrrsds-1.6/",
pages = "15--17",
abstract = "This position paper outlines my research vision for developing human-like dialogue systems capable of both perceiving and expressing emotions through multimodal communication. My current research focuses on two main areas: multimodal emotion recognition and non-verbal cue generation. For emotion recognition, I constructed a Japanese multimodal dialogue dataset that captures natural, dyadic face-to-face interactions and developed an emotional valence recognition model that integrates textual, speech and physiological inputs. On the generation side, my research explores non-verbal cue generation for embodied conversational agents (ECAs). Finally, the paper discusses the future of SDSs, emphasizing the shift from traditional turn-based architectures to full-duplex, real-time, multimodal systems."
}
<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="jiang-2025-towards">
<titleInfo>
<title>Towards Human-Like Dialogue Systems: Integrating Multimodal Emotion Recognition and Non-Verbal Cue Generation</title>
</titleInfo>
<name type="personal">
<namePart type="given">Jingjing</namePart>
<namePart type="family">Jiang</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<originInfo>
<dateIssued>2025-08</dateIssued>
</originInfo>
<typeOfResource>text</typeOfResource>
<relatedItem type="host">
<titleInfo>
<title>Proceedings of the 21st Workshop of Young Researchers’ Roundtable on Spoken Dialogue Systems</title>
</titleInfo>
<name type="personal">
<namePart type="given">Ryan</namePart>
<namePart type="family">Whetten</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Virgile</namePart>
<namePart type="family">Sucal</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Anh</namePart>
<namePart type="family">Ngo</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Kranti</namePart>
<namePart type="family">Chalamalasetti</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Koji</namePart>
<namePart type="family">Inoue</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Gaetano</namePart>
<namePart type="family">Cimino</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Zachary</namePart>
<namePart type="family">Yang</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Yuki</namePart>
<namePart type="family">Zenimoto</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Ricardo</namePart>
<namePart type="family">Rodriguez</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<originInfo>
<publisher>Association for Computational Linguistics</publisher>
<place>
<placeTerm type="text">Avignon, France</placeTerm>
</place>
</originInfo>
<genre authority="marcgt">conference publication</genre>
</relatedItem>
    <abstract>This position paper outlines my research vision for developing human-like dialogue systems capable of both perceiving and expressing emotions through multimodal communication. My current research focuses on two main areas: multimodal emotion recognition and non-verbal cue generation. For emotion recognition, I constructed a Japanese multimodal dialogue dataset that captures natural, dyadic face-to-face interactions and developed an emotional valence recognition model that integrates textual, speech, and physiological inputs. On the generation side, my research explores non-verbal cue generation for embodied conversational agents (ECAs). Finally, the paper discusses the future of spoken dialogue systems (SDSs), emphasizing the shift from traditional turn-based architectures to full-duplex, real-time, multimodal systems.</abstract>
<identifier type="citekey">jiang-2025-towards</identifier>
<location>
<url>https://aclanthology.org/2025.yrrsds-1.6/</url>
</location>
<part>
<date>2025-08</date>
<extent unit="page">
<start>15</start>
<end>17</end>
</extent>
</part>
</mods>
</modsCollection>
%0 Conference Proceedings
%T Towards Human-Like Dialogue Systems: Integrating Multimodal Emotion Recognition and Non-Verbal Cue Generation
%A Jiang, Jingjing
%Y Whetten, Ryan
%Y Sucal, Virgile
%Y Ngo, Anh
%Y Chalamalasetti, Kranti
%Y Inoue, Koji
%Y Cimino, Gaetano
%Y Yang, Zachary
%Y Zenimoto, Yuki
%Y Rodriguez, Ricardo
%S Proceedings of the 21st Workshop of Young Researchers’ Roundtable on Spoken Dialogue Systems
%D 2025
%8 August
%I Association for Computational Linguistics
%C Avignon, France
%F jiang-2025-towards
%X This position paper outlines my research vision for developing human-like dialogue systems capable of both perceiving and expressing emotions through multimodal communication. My current research focuses on two main areas: multimodal emotion recognition and non-verbal cue generation. For emotion recognition, I constructed a Japanese multimodal dialogue dataset that captures natural, dyadic face-to-face interactions and developed an emotional valence recognition model that integrates textual, speech, and physiological inputs. On the generation side, my research explores non-verbal cue generation for embodied conversational agents (ECAs). Finally, the paper discusses the future of spoken dialogue systems (SDSs), emphasizing the shift from traditional turn-based architectures to full-duplex, real-time, multimodal systems.
%U https://aclanthology.org/2025.yrrsds-1.6/
%P 15-17
Markdown (Informal)
[Towards Human-Like Dialogue Systems: Integrating Multimodal Emotion Recognition and Non-Verbal Cue Generation](https://aclanthology.org/2025.yrrsds-1.6/) (Jiang, YRRSDS 2025)
ACL
Jingjing Jiang. 2025. Towards Human-Like Dialogue Systems: Integrating Multimodal Emotion Recognition and Non-Verbal Cue Generation. In Proceedings of the 21st Workshop of Young Researchers’ Roundtable on Spoken Dialogue Systems, pages 15–17, Avignon, France. Association for Computational Linguistics.