2025
DSLCMM: A Multimodal Human-Machine Dialogue Corpus Built through Competitions
Ryuichiro Higashinaka | Tetsuro Takahashi | Shinya Iizuka | Sota Horiuchi | Michimasa Inaba | Zhiyang Qi | Yuta Sasaki | Kotaro Funakoshi | Shoji Moriya | Shiki Sato | Takashi Minato | Kurima Sakai | Tomo Funayama | Masato Komuro | Hiroyuki Nishikawa | Ryosaku Makino | Hirofumi Kikuchi | Mayumi Usami
Proceedings of the 15th International Workshop on Spoken Dialogue Systems Technology
A corpus of dialogues between multimodal systems and humans is indispensable for the development and improvement of such systems. However, there is a shortage of human-machine multimodal dialogue datasets, which hinders the widespread deployment of these systems in society. To address this issue, we construct a Japanese multimodal human-machine dialogue corpus, DSLCMM, by collecting and organizing data from the Dialogue System Live Competitions (DSLCs). This paper details the procedure for constructing the corpus and presents our analysis of the relationship between various dialogue features and evaluation scores provided by users.
Analyzing Dialogue System Behavior in a Specific Situation Requiring Interpersonal Consideration
Tetsuro Takahashi | Hirofumi Kikuchi | Jie Yang | Hiroyuki Nishikawa | Masato Komuro | Ryosaku Makino | Shiki Sato | Yuta Sasaki | Shinji Iwata | Asahi Hentona | Takato Yamazaki | Shoji Moriya | Masaya Ohagi | Zhiyang Qi | Takashi Kodama | Akinobu Lee | Takashi Minato | Kurima Sakai | Tomo Funayama | Kotaro Funakoshi | Mayumi Usami | Michimasa Inaba | Ryuichiro Higashinaka
Proceedings of the 26th Annual Meeting of the Special Interest Group on Discourse and Dialogue
In human-human conversation, interpersonal consideration for the interlocutor is essential, and similar expectations are increasingly placed on dialogue systems. This study examines the behavior of dialogue systems in a specific interpersonal scenario where a user vents frustrations and seeks emotional support from a long-time friend represented by a dialogue system. We conducted a human evaluation and qualitative analysis of 15 dialogue systems under this setting. These systems implemented diverse strategies, such as structuring dialogue into distinct phases, modeling interpersonal relationships, and incorporating cognitive behavioral therapy techniques. Our analysis reveals that these approaches contributed to improved perceived empathy, coherence, and appropriateness, highlighting the importance of design choices in socially sensitive dialogue.
How Stylistic Similarity Shapes Preferences in Dialogue Dataset with User and Third Party Evaluations
Ikumi Numaya | Shoji Moriya | Shiki Sato | Reina Akama | Jun Suzuki
Proceedings of the 26th Annual Meeting of the Special Interest Group on Discourse and Dialogue
Recent advancements in dialogue generation have broadened the scope of human–bot interactions, enabling not only contextually appropriate responses but also the analysis of human affect and sensitivity. While prior work has suggested that stylistic similarity between user and system may enhance user impressions, the distinction between subjective and objective similarity is often overlooked. To investigate this issue, we introduce a novel dataset that includes users’ preferences, subjective stylistic similarity based on users’ own perceptions, and objective stylistic similarity annotated by third party evaluators in open-domain dialogue settings. Analysis using the constructed dataset reveals a strong positive correlation between subjective stylistic similarity and user preference. Furthermore, our analysis suggests an important finding: users’ subjective stylistic similarity differs from third party objective similarity. This underscores the importance of distinguishing between subjective and objective evaluations and understanding the distinct aspects each captures when analyzing the relationship between stylistic similarity and user preferences. The dataset presented in this paper is available online.
Key Challenges in Multimodal Task-Oriented Dialogue Systems: Insights from a Large Competition-Based Dataset
Shiki Sato | Shinji Iwata | Asahi Hentona | Yuta Sasaki | Takato Yamazaki | Shoji Moriya | Masaya Ohagi | Hirofumi Kikuchi | Jie Yang | Zhiyang Qi | Takashi Kodama | Akinobu Lee | Masato Komuro | Hiroyuki Nishikawa | Ryosaku Makino | Takashi Minato | Kurima Sakai | Tomo Funayama | Kotaro Funakoshi | Mayumi Usami | Michimasa Inaba | Tetsuro Takahashi | Ryuichiro Higashinaka
Proceedings of the 26th Annual Meeting of the Special Interest Group on Discourse and Dialogue
Challenges in multimodal task-oriented dialogue between humans and systems, particularly those involving audio and visual interactions, have not been sufficiently explored or shared, forcing researchers to define improvement directions individually without a clearly shared roadmap. To address these challenges, we organized a competition for multimodal task-oriented dialogue systems and constructed a large competition-based dataset of 1,865 minutes of Japanese task-oriented dialogues. This dataset includes audio and visual interactions between diverse systems and human participants. After analyzing system behaviors identified as problematic by the human participants in questionnaire surveys and notable methods employed by the participating teams, we identified key challenges in multimodal task-oriented dialogue systems and discussed potential directions for overcoming these challenges.
2024
A Multimodal Dialogue System to Lead Consensus Building with Emotion-Displaying
Shinnosuke Nozue | Yuto Nakano | Shoji Moriya | Tomoki Ariyama | Kazuma Kokuta | Suchun Xie | Kai Sato | Shusaku Sone | Ryohei Kamei | Reina Akama | Yuichiroh Matsubayashi | Keisuke Sakaguchi
Proceedings of the 25th Annual Meeting of the Special Interest Group on Discourse and Dialogue
The evolution of large language models has enabled fluent dialogue, increasing interest in the coexistence of humans and avatars. An essential aspect of achieving this coexistence involves developing sophisticated dialogue systems that can influence user behavior. Against this background, we propose an effective multimodal dialogue system designed to promote consensus building with humans. Our system employs a slot-filling strategy to guide discussions and attempts to influence users with suggestions through emotional expression and intent conveyance via its avatar. These innovations have resulted in our system achieving the highest performance in a competition evaluating consensus building between humans and dialogue systems. We hope that our research will promote further discussion on the development of dialogue systems that enhance consensus building in human collaboration.
2023
TohokuNLP at SemEval-2023 Task 5: Clickbait Spoiling via Simple Seq2Seq Generation and Ensembling
Hiroto Kurita | Ikumi Ito | Hiroaki Funayama | Shota Sasaki | Shoji Moriya | Ye Mengyu | Kazuma Kokuta | Ryujin Hatakeyama | Shusaku Sone | Kentaro Inui
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)
This paper describes our system submitted to SemEval-2023 Task 5: Clickbait Spoiling. We work on spoiler generation (subtask 2) and develop a system comprising two parts: 1) simple seq2seq spoiler generation and 2) post-hoc model ensembling. Using this simple method, we address the challenge of generating multipart spoilers. On the test set, our submitted system outperformed the baseline by a large margin (approximately 10 BLEU points) for mixed types of spoilers. We also found that our system successfully handled the challenge of multipart spoilers, confirming the effectiveness of our approach.