@inproceedings{nakaguro-yoshino-2026-exploring,
title = "Exploring Emotional Nuances in Spoken Dialogue: Dataset Construction and Prediction of Emotional Dialogue Breakdown",
author = "Nakaguro, Hyuga and
Yoshino, Koichiro",
editor = "Riccardi, Giuseppe and
Mousavi, Seyed Mahed and
Torres, Maria Ines and
Yoshino, Koichiro and
Callejas, Zoraida and
Chowdhury, Shammur Absar and
Chen, Yun-Nung and
Bechet, Frederic and
Gustafson, Joakim and
Damnati, G{\'e}raldine and
Papangelis, Alex and
D{'}Haro, Luis Fernando and
Mendon{\c{c}}a, John and
Bernardi, Raffaella and
Hakkani-Tur, Dilek and
Di Fabbrizio, Giuseppe {``}Pino{''} and
Kawahara, Tatsuya and
Alam, Firoj and
Tur, Gokhan and
Johnston, Michael",
booktitle = "Proceedings of the 16th International Workshop on Spoken Dialogue System Technology",
month = feb,
year = "2026",
address = "Trento, Italy",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2026.iwsds-1.9/",
pages = "95--103",
abstract = "In spoken dialogue systems, even when the utterance text is the same, speaking style or tone differences can change its nuance. To respond appropriately in such cases, systems must accurately interpret paralinguistic information. Our study evaluates such a system{'}s ability using the ``paraling-dial'' dataset, which pairs a fixed utterance text with five distinct emotional expressions and their corresponding responses. We define a task using this dataset that detects mismatches{---}referred to as emotional dialogue breakdowns{---}between the expressed emotion of an utterance and the content of its response. We propose a breakdown detection system based on the Feature-wise Linear Modulation ({F}i{LM}) model, under the hypothesis that emotion dynamically controls text interpretation. Our experimental results show that the proposed model achieves 93.8{\%} accuracy with gold emotion labels and 91.2{\%} with predicted labels, demonstrating both its effectiveness and practicality. We also compare different types of control signals to identify the level of information required for such a breakdown detection task: emotion labels, emotion embeddings, and acoustic features. The results suggest that the appropriate level of abstraction, rather than simply richer information, is crucial for designing effective control signals."
}
<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="nakaguro-yoshino-2026-exploring">
<titleInfo>
<title>Exploring Emotional Nuances in Spoken Dialogue: Dataset Construction and Prediction of Emotional Dialogue Breakdown</title>
</titleInfo>
<name type="personal">
<namePart type="given">Hyuga</namePart>
<namePart type="family">Nakaguro</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Koichiro</namePart>
<namePart type="family">Yoshino</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<originInfo>
<dateIssued>2026-02</dateIssued>
</originInfo>
<typeOfResource>text</typeOfResource>
<relatedItem type="host">
<titleInfo>
<title>Proceedings of the 16th International Workshop on Spoken Dialogue System Technology</title>
</titleInfo>
<name type="personal">
<namePart type="given">Giuseppe</namePart>
<namePart type="family">Riccardi</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Seyed</namePart>
<namePart type="given">Mahed</namePart>
<namePart type="family">Mousavi</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Maria</namePart>
<namePart type="given">Ines</namePart>
<namePart type="family">Torres</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Koichiro</namePart>
<namePart type="family">Yoshino</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Zoraida</namePart>
<namePart type="family">Callejas</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Shammur</namePart>
<namePart type="given">Absar</namePart>
<namePart type="family">Chowdhury</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Yun-Nung</namePart>
<namePart type="family">Chen</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Frederic</namePart>
<namePart type="family">Bechet</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Joakim</namePart>
<namePart type="family">Gustafson</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Géraldine</namePart>
<namePart type="family">Damnati</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Alex</namePart>
<namePart type="family">Papangelis</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Luis</namePart>
<namePart type="given">Fernando</namePart>
<namePart type="family">D’Haro</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">John</namePart>
<namePart type="family">Mendonça</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Raffaella</namePart>
<namePart type="family">Bernardi</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Dilek</namePart>
<namePart type="family">Hakkani-Tur</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Giuseppe</namePart>
<namePart type="given">“Pino”</namePart>
<namePart type="family">Di Fabbrizio</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Tatsuya</namePart>
<namePart type="family">Kawahara</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Firoj</namePart>
<namePart type="family">Alam</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Gokhan</namePart>
<namePart type="family">Tur</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Michael</namePart>
<namePart type="family">Johnston</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<originInfo>
<publisher>Association for Computational Linguistics</publisher>
<place>
<placeTerm type="text">Trento, Italy</placeTerm>
</place>
</originInfo>
<genre authority="marcgt">conference publication</genre>
</relatedItem>
<abstract>In spoken dialogue systems, even when the utterance text is the same, speaking style or tone differences can change its nuance. To respond appropriately in such cases, systems must accurately interpret paralinguistic information. Our study evaluates such a system’s ability using the “paraling-dial” dataset, which pairs a fixed utterance text with five distinct emotional expressions and their corresponding responses. We define a task using this dataset that detects mismatches—referred to as emotional dialogue breakdowns—between the expressed emotion of an utterance and the content of its response. We propose a breakdown detection system based on the Feature-wise Linear Modulation (FiLM) model, under the hypothesis that emotion dynamically controls text interpretation. Our experimental results show that the proposed model achieves 93.8% accuracy with gold emotion labels and 91.2% with predicted labels, demonstrating both its effectiveness and practicality. We also compare different types of control signals to identify the level of information required for such a breakdown detection task: emotion labels, emotion embeddings, and acoustic features. The results suggest that the appropriate level of abstraction, rather than simply richer information, is crucial for designing effective control signals.</abstract>
<identifier type="citekey">nakaguro-yoshino-2026-exploring</identifier>
<location>
<url>https://aclanthology.org/2026.iwsds-1.9/</url>
</location>
<part>
<date>2026-02</date>
<extent unit="page">
<start>95</start>
<end>103</end>
</extent>
</part>
</mods>
</modsCollection>
%0 Conference Proceedings
%T Exploring Emotional Nuances in Spoken Dialogue: Dataset Construction and Prediction of Emotional Dialogue Breakdown
%A Nakaguro, Hyuga
%A Yoshino, Koichiro
%Y Riccardi, Giuseppe
%Y Mousavi, Seyed Mahed
%Y Torres, Maria Ines
%Y Yoshino, Koichiro
%Y Callejas, Zoraida
%Y Chowdhury, Shammur Absar
%Y Chen, Yun-Nung
%Y Bechet, Frederic
%Y Gustafson, Joakim
%Y Damnati, Géraldine
%Y Papangelis, Alex
%Y D’Haro, Luis Fernando
%Y Mendonça, John
%Y Bernardi, Raffaella
%Y Hakkani-Tur, Dilek
%Y Di Fabbrizio, Giuseppe “Pino”
%Y Kawahara, Tatsuya
%Y Alam, Firoj
%Y Tur, Gokhan
%Y Johnston, Michael
%S Proceedings of the 16th International Workshop on Spoken Dialogue System Technology
%D 2026
%8 February
%I Association for Computational Linguistics
%C Trento, Italy
%F nakaguro-yoshino-2026-exploring
%X In spoken dialogue systems, even when the utterance text is the same, speaking style or tone differences can change its nuance. To respond appropriately in such cases, systems must accurately interpret paralinguistic information. Our study evaluates such a system’s ability using the “paraling-dial” dataset, which pairs a fixed utterance text with five distinct emotional expressions and their corresponding responses. We define a task using this dataset that detects mismatches—referred to as emotional dialogue breakdowns—between the expressed emotion of an utterance and the content of its response. We propose a breakdown detection system based on the Feature-wise Linear Modulation (FiLM) model, under the hypothesis that emotion dynamically controls text interpretation. Our experimental results show that the proposed model achieves 93.8% accuracy with gold emotion labels and 91.2% with predicted labels, demonstrating both its effectiveness and practicality. We also compare different types of control signals to identify the level of information required for such a breakdown detection task: emotion labels, emotion embeddings, and acoustic features. The results suggest that the appropriate level of abstraction, rather than simply richer information, is crucial for designing effective control signals.
%U https://aclanthology.org/2026.iwsds-1.9/
%P 95-103
Markdown (Informal)
[Exploring Emotional Nuances in Spoken Dialogue: Dataset Construction and Prediction of Emotional Dialogue Breakdown](https://aclanthology.org/2026.iwsds-1.9/) (Nakaguro & Yoshino, IWSDS 2026)
ACL