Michimasa Inaba


2024

Data Augmentation Integrating Dialogue Flow and Style to Adapt Spoken Dialogue Systems to Low-Resource User Groups
Zhiyang Qi | Michimasa Inaba
Proceedings of the 25th Annual Meeting of the Special Interest Group on Discourse and Dialogue

This study addresses the interaction challenges encountered by spoken dialogue systems (SDSs) when engaging with users who exhibit distinct conversational behaviors, particularly minors, in scenarios where data are scarce. We propose a novel data augmentation framework to enhance SDS performance for user groups with limited resources. Our approach leverages a large language model (LLM) to extract speaker styles and a pre-trained language model (PLM) to simulate dialogue act history. This method generates enriched and personalized dialogue data, facilitating improved interactions with unique user demographics. Extensive experiments validate the efficacy of our methodology, highlighting its potential to foster the development of more adaptive and inclusive dialogue systems.

Interactive Dialogue Interface for Personalized News Article Comprehension
Tomoya Higuchi | Michimasa Inaba
Proceedings of the 25th Annual Meeting of the Special Interest Group on Discourse and Dialogue

We developed an interface to explain news articles through dialogue by considering the user’s comprehension level. The interface generates several pertinent questions based on the ongoing dialogue and news article, and users advance the conversation by selecting a question. Based on the user’s selected questions, the interface estimates their comprehension level of the news article and adjusts the difficulty of the generated questions accordingly. This enables a personalized dialogue tailored to each user’s comprehension needs. The results of the baseline comparison experiments confirmed the usefulness of the interface.

User Review Writing via Interview with Dialogue Systems
Yoshiki Tanaka | Michimasa Inaba
Proceedings of the 25th Annual Meeting of the Special Interest Group on Discourse and Dialogue

User reviews on e-commerce and review sites are crucial for making purchase decisions, but creating detailed reviews is time-consuming and labor-intensive. In this study, we propose a novel use of dialogue systems to facilitate user review creation by generating reviews from information gathered during interview dialogues with users. To validate our approach, we implemented our system using GPT-4 and conducted comparative experiments from the perspectives of system users and review readers. The results indicate that participants who used our system rated their interactions positively. Additionally, reviews generated by our system required less editing to achieve user satisfaction than those produced by the baseline. We also evaluated the reviews from the readers’ perspective and found that our system-generated reviews are more helpful than those written by humans. Despite challenges with the fluency of the generated reviews, our method offers a promising new approach to review writing.

PersonaCLR: Evaluation Model for Persona Characteristics via Contrastive Learning of Linguistic Style Representation
Michimasa Inaba
Proceedings of the 25th Annual Meeting of the Special Interest Group on Discourse and Dialogue

Persona-aware dialogue systems can improve the consistency of the system’s responses as well as users’ trust and enjoyment. Filtering out non-persona-like utterances is important for constructing such systems. This paper presents the PersonaCLR model for capturing the intensity of persona characteristics in a given utterance. We trained the model with contrastive learning based on whether utterances come from the same speaker. Contrastive learning enables PersonaCLR to evaluate the persona characteristics of a given utterance even if the target persona is not included in the training data. For training and evaluating our model, we also constructed a new dataset of 2,155 character utterances from 100 Japanese online novels. Experimental results indicate that our model outperforms existing methods and a strong baseline using a large language model. Our source code, pre-trained model, and dataset are available at https://github.com/1never/PersonaCLR.

AIWolfDial 2024: Summary of Natural Language Division of 6th International AIWolf Contest
Yoshinobu Kano | Yuto Sahashi | Neo Watanabe | Kaito Kagaminuma | Claus Aranha | Daisuke Katagami | Kei Harada | Michimasa Inaba | Takeshi Ito | Hirotaka Osawa | Takashi Otsuki | Fujio Toriumi
Proceedings of the 2nd International AIWolfDial Workshop

We held our 6th annual international AIWolf contest, in which agents automatically play the Werewolf game “Mafia”, where players try to identify liars through conversation. The contest aims to promote the development of agents capable of more natural, higher-level conversation, involving longer contexts, personal relationships, semantics, pragmatics, and logic, while revealing the capabilities and limits of generative AI. In the Natural Language Division of the contest, eight Japanese-speaking agent teams and five English-speaking agents played games against one another. Using the game logs, we performed human subjective evaluations, computed win rates, and conducted detailed log analysis. We found that overall system performance improved substantially over the previous year, owing to recent advances in LLMs. Several new ideas improved how the LLMs are used, such as summarization, characterization, and logic handled outside the LLM. However, the systems are far from perfect; the generated talk is sometimes inconsistent with the game actions. Our future work includes revealing whether LLMs can embody the duality of the “liar”, that is, holding the “true” and the “false” circumstances of the agent at the same time, as well as how those circumstances appear to other agents.

Enhancing Dialogue Generation in Werewolf Game Through Situation Analysis and Persuasion Strategies
Zhiyang Qi | Michimasa Inaba
Proceedings of the 2nd International AIWolfDial Workshop

Recent advancements in natural language processing, particularly with large language models (LLMs) such as GPT-4, have significantly enhanced dialogue systems, enabling them to generate more natural and fluent conversations. Despite these improvements, challenges persist, such as managing continuous dialogues, retaining memory, and minimizing hallucinations. The AIWolfDial2024 shared task addresses these challenges by employing the Werewolf Game, an incomplete-information game, to test the capabilities of LLMs in complex interactive environments. This paper introduces an LLM-based Werewolf Game AI in which each role is supported by situation analysis to aid response generation. Additionally, for the werewolf role, various persuasion strategies, including logical appeal, credibility appeal, and emotional appeal, are employed to effectively persuade other players to align with its actions.

Enhancing Consistency of Werewolf AI through Dialogue Summarization and Persona Information
Yoshiki Tanaka | Takumasa Kaneko | Hiroki Onozeki | Natsumi Ezure | Ryuichi Uehara | Zhiyang Qi | Tomoya Higuchi | Ryutaro Asahara | Michimasa Inaba
Proceedings of the 2nd International AIWolfDial Workshop

The Werewolf Game is a communication game in which players’ reasoning and discussion skills are essential. In this study, we present a Werewolf AI agent developed for the AIWolfDial 2024 shared task, co-hosted with the 17th INLG. In recent years, large language models such as ChatGPT have garnered attention for their exceptional response generation and reasoning capabilities. We thus developed LLM-based agents for the Werewolf Game. This study aims to enhance the consistency of the agent’s utterances by utilizing dialogue summaries generated by LLMs together with manually designed personas and utterance examples. By analyzing self-match game logs, we demonstrate that the agent’s utterances are contextually consistent and that its character, including tone, is maintained throughout the game.

2023

Generating Character Lines in Four-Panel Manga
Michimasa Inaba
Proceedings of the 37th Pacific Asia Conference on Language, Information and Computation

SumRec: A Framework for Recommendation using Open-Domain Dialogue
Ryutaro Asahara | Masaki Takahashi | Chiho Iwahashi | Michimasa Inaba
Proceedings of the 37th Pacific Asia Conference on Language, Information and Computation

AIWolfDial 2023: Summary of Natural Language Division of 5th International AIWolf Contest
Yoshinobu Kano | Neo Watanabe | Kaito Kagaminuma | Claus Aranha | Jaewon Lee | Benedek Hauer | Hisaichi Shibata | Soichiro Miki | Yuta Nakamura | Takuya Okubo | Soga Shigemura | Rei Ito | Kazuki Takashima | Tomoki Fukuda | Masahiro Wakutani | Tomoya Hatanaka | Mami Uchida | Mikio Abe | Akihiro Mikami | Takashi Otsuki | Zhiyang Qi | Kei Harada | Michimasa Inaba | Daisuke Katagami | Hirotaka Osawa | Fujio Toriumi
Proceedings of the 16th International Natural Language Generation Conference: Generation Challenges

We held our 5th annual international AIWolf contest, in which agents automatically play the Werewolf game “Mafia”, where players try to identify liars through conversation. The contest aims to promote the development of agents capable of more natural, higher-level conversation, involving longer contexts, personal relationships, semantics, pragmatics, and logic, while revealing the capabilities and limits of generative AI. In the Natural Language Division of the contest, six Japanese-speaking agents from five teams and three English-speaking agents played games against one another. Using the game logs, we performed human subjective evaluations and detailed log analysis. We found that overall system performance improved substantially over the previous year, owing to recent advances in LLMs. However, the systems are far from perfect; the generated talk is sometimes inconsistent with the game actions, and it remains doubtful that the agents infer roles through logic rather than superficial utterance generation. Although not explicitly observed in these logs, it would still be difficult for an agent to tell a lie, that is, to pretend to be a villager while pursuing the opposite goal. Our future work includes revealing whether LLMs can embody the duality of the “liar”, that is, holding the “true” and the “false” circumstances of the agent at the same time, as well as how those circumstances appear to other agents.

2022

Collection and Analysis of Travel Agency Task Dialogues with Age-Diverse Speakers
Michimasa Inaba | Yuya Chiba | Ryuichiro Higashinaka | Kazunori Komatani | Yusuke Miyao | Takayuki Nagai
Proceedings of the Thirteenth Language Resources and Evaluation Conference

When individuals communicate with each other, they use different vocabulary, speaking speeds, facial expressions, and body language depending on whom they talk to. This paper focuses on the speaker’s age as a factor that affects these changes in communication. We collected a multimodal dialogue corpus with a wide range of speaker ages. As the dialogue task, we focus on travel, which interests people of all ages, and set up a task based on a tourism consultation between an operator and a customer at a travel agency. This paper provides details of the dialogue task, the collection procedure and annotations, and an analysis of the characteristics of the dialogues and facial expressions with respect to speaker age. The results of the analysis suggest that the adult speakers have more independent opinions, that the older speakers express their opinions more frequently than other age groups, and that the operators smiled more frequently at the minor speakers.

2019

Proceedings of the 1st International Workshop of AI Werewolf and Dialog System (AIWolfDial2019)
Yoshinobu Kano | Claus Aranha | Michimasa Inaba | Fujio Toriumi | Hirotaka Osawa | Daisuke Katagami | Takashi Otsuki
Proceedings of the 1st International Workshop of AI Werewolf and Dialog System (AIWolfDial2019)

Overview of AIWolfDial 2019 Shared Task: Contest of Automatic Dialog Agents to Play the Werewolf Game through Conversations
Yoshinobu Kano | Claus Aranha | Michimasa Inaba | Fujio Toriumi | Hirotaka Osawa | Daisuke Katagami | Takashi Otsuki | Issei Tsunoda | Shoji Nagayama | Dolça Tellols | Yu Sugawara | Yohei Nakata
Proceedings of the 1st International Workshop of AI Werewolf and Dialog System (AIWolfDial2019)

2018

Estimating User Interest from Open-Domain Dialogue
Michimasa Inaba | Kenichi Takahashi
Proceedings of the 19th Annual SIGdial Meeting on Discourse and Dialogue

Dialogue personalization is an important issue in the field of open-domain chat-oriented dialogue systems. If these systems could consider their users’ interests, user engagement and satisfaction would be greatly improved. This paper proposes a neural network-based method for estimating users’ interests from their utterances in chat dialogues to personalize dialogue systems’ responses. We introduce a method for effectively extracting topics and user interests from utterances and also propose a pre-training approach that increases learning efficiency. Our experimental results indicate that the proposed model can estimate users’ interests more accurately than baseline approaches.

2016

Neural Utterance Ranking Model for Conversational Dialogue Systems
Michimasa Inaba | Kenichi Takahashi
Proceedings of the 17th Annual Meeting of the Special Interest Group on Discourse and Dialogue

The dialogue breakdown detection challenge: Task description, datasets, and evaluation metrics
Ryuichiro Higashinaka | Kotaro Funakoshi | Yuka Kobayashi | Michimasa Inaba
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

Dialogue breakdown detection is a promising technique in dialogue systems. To promote the research and development of such a technique, we organized a dialogue breakdown detection challenge where the task is to detect a system’s inappropriate utterances that lead to dialogue breakdowns in chat. This paper describes the design, datasets, and evaluation metrics for the challenge as well as the methods and results of the submitted runs of the participants.