Seitaro Shinagawa

2024

The Gap in the Strategy of Recovering Task Failure between GPT-4V and Humans in a Visual Dialogue
Ryosuke Oshima | Seitaro Shinagawa | Shigeo Morishima
Proceedings of the 25th Annual Meeting of the Special Interest Group on Discourse and Dialogue

Goal-oriented dialogue systems interact with humans to accomplish specific tasks. However, sometimes these systems fail to establish a common ground with users, leading to task failures. In such cases, it is crucial not to just end with failure but to correct and recover the dialogue to turn it into a success for building a robust goal-oriented dialogue system. Effective recovery from task failures in a goal-oriented dialogue involves not only successful recovery but also accurately understanding the situation of the failed task to minimize unnecessary interactions and avoid frustrating the user. In this study, we analyze the capabilities of GPT-4V in recovering from task failures by comparing its performance with that of humans using the Guess What?! game. The results show that GPT-4V employs less efficient recovery strategies than humans, such as asking additional unnecessary questions. We also found that while humans can occasionally ask questions that doubt the accuracy of the interlocutor’s answer during task recovery, GPT-4V lacks this capability.

2020

Emotional Speech Corpus for Persuasive Dialogue System
Sara Asai | Koichiro Yoshino | Seitaro Shinagawa | Sakriani Sakti | Satoshi Nakamura
Proceedings of the Twelfth Language Resources and Evaluation Conference

Expressing emotion is known as an efficient way to persuade one’s dialogue partner to accept one’s claim or proposal. Emotional expression in speech can express the speaker’s emotion more directly than using only emotional expression in the text, which will lead to a more persuasive dialogue. In this paper, we built a speech dialogue corpus in a persuasive scenario that uses emotional expressions to build a persuasive dialogue system with emotional expressions. We extended an existing text dialogue corpus by adding variations of emotional responses, collected by crowd-sourcing, to cover different combinations of broad dialogue context and a variety of emotional states. Then, we recorded emotional speech consisting of the collected emotional expressions spoken by a voice actor. The experimental results indicate that the collected emotional expressions with their speeches have higher emotional expressiveness for expressing the system’s emotion to users.

Co-authors

Koichiro Yoshino (1)

Venues

LREC (1)
SIGDIAL (1)
