Mikio Nakano


2025

This study aims to improve the efficiency and quality of career interviews conducted by nursing managers. To this end, we have been developing a slot-filling dialogue system that conducts pre-interviews to collect information on staff careers as a preparatory step before the actual interviews. Conventional slot-filling-based interview dialogue systems are limited in the flexibility of their information collection because the dialogue progresses based on predefined slot sets. We therefore propose a method that leverages large language models (LLMs) to dynamically generate new slots according to the flow of the dialogue, achieving more natural conversations. Furthermore, we incorporate abduction into the slot generation process to enable more appropriate and effective slot generation. To validate the effectiveness of the proposed method, we conducted experiments using a user simulator. The results suggest that the proposed method using abduction is effective in enhancing both information-collecting capabilities and the naturalness of the dialogue.
In this paper, we propose a dialogue control framework that uses large language models for semi-structured interviews. Specifically, large language models are used to generate the interviewer’s utterances and to make conditional branching decisions based on the understanding of the interviewee’s responses. The framework enables flexible dialogue control in interview conversations by generating and updating slots and values according to interviewee answers. More importantly, through prompt tuning of the LLMs, we devised a mechanism that accumulates the list of generated slots as the number of interviewees increases over the course of the semi-structured interviews. Evaluation results showed that the proposed approach of accumulating the list of generated slots throughout the semi-structured interviews outperforms the baseline without accumulating generated slots in terms of the number of persona attributes and values collected through the semi-structured interview.
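The idea of accumulating generated slots across interviewees can be illustrated with a minimal sketch. It is not the framework's implementation: the names `SlotStore`, `run_interview`, and `propose_slots` are hypothetical, and `propose_slots` stands in for the LLM call that suggests new slots from an interviewee's answers.

```python
from dataclasses import dataclass, field


@dataclass
class SlotStore:
    """Slot names accumulated across interviews (hypothetical structure)."""
    slots: list[str] = field(default_factory=list)

    def merge(self, new_slots: list[str]) -> None:
        # Keep first-seen order and skip case-insensitive duplicates.
        seen = {s.lower() for s in self.slots}
        for name in new_slots:
            key = name.strip().lower()
            if key and key not in seen:
                self.slots.append(name.strip())
                seen.add(key)


def run_interview(store, answers, propose_slots):
    """Fill the slots known so far, then add newly proposed slots to the store.

    `answers` maps slot names to the interviewee's values; `propose_slots`
    is an assumed stand-in for the LLM-based slot generator.
    """
    filled = {slot: answers[slot] for slot in store.slots if slot in answers}
    store.merge(propose_slots(answers))
    return filled


if __name__ == "__main__":
    store = SlotStore(["hobby"])
    # Toy generator: treats every answered topic as a candidate slot.
    toy_generator = lambda answers: list(answers)
    run_interview(store, {"hobby": "tennis", "hometown": "Osaka"}, toy_generator)
    print(store.slots)  # the second interviewee starts with the larger slot list
```

In this sketch the slot list grows monotonically, so later interviewees are asked about attributes that earlier interviewees introduced.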
To enable the broader application of dialogue system technology across various fields, it is beneficial to empower individuals with limited programming experience to build dialogue systems. Domain experts in fields where dialogue system technology is highly relevant may not necessarily possess expertise in information technology. This paper presents D4AC, which works as a client for text-based dialogue servers. By combining D4AC with a no-code tool for developing text-based dialogue servers, it is possible to build multimodal dialogue systems without coding. These systems can adapt to the user’s age, gender, emotions, and engagement levels obtained from their facial images. D4AC can be installed, launched, and configured without technical knowledge. D4AC was used in student projects at a university, and the results suggested its effectiveness.
This paper proposes a methodology for identifying evaluation items for practical dialogue systems. Traditionally, user satisfaction and user experiences have been the primary metrics for evaluating dialogue systems. However, there are various other evaluation items to consider when developing and operating practical dialogue systems, and such evaluation items are expected to lead to new research topics. So far, there has been no methodology for identifying these evaluation items. We propose identifying evaluation items based on business-dialogue system alignment models, which are applications of business-IT alignment models used in the development and operation of practical IT systems. We also present a generic model that facilitates the construction of a business-dialogue system alignment model for each dialogue system.
We aim to develop a library for classifying affirmative and negative user responses, intended for integration into a dialogue system development toolkit. Such a library is expected to perform well even with minimal annotated target domain data, addressing the practical challenge of preparing large datasets for each target domain. This short paper compares several approaches under conditions where little or no annotated data is available in the target domain. One approach involves fine-tuning a pre-trained BERT model, while the other utilizes a GPT API for zero-shot or few-shot learning. Since these approaches differ in execution speed, development effort, and execution costs, in addition to performance, the results serve as a basis for discussing an appropriate configuration suited to specific requirements. Additionally, we have released the training data and the fine-tuned BERT model for Japanese affirmative/negative classification.
This paper addresses the issue of the significant labor required to test interview dialogue systems. While interview dialogue systems are expected to be useful in various scenarios, like other dialogue systems, testing them with human users requires significant effort and cost. Therefore, testing with user simulators can be beneficial. Since most conventional user simulators have been primarily designed for training task-oriented dialogue systems, little attention has been paid to the personas of the simulated users. During development, testing interview dialogue systems requires simulating a wide range of user behaviors, but manually creating a large number of personas is labor-intensive. We propose a method that automatically generates personas for user simulators using a large language model. Furthermore, by assigning personality traits related to communication styles when generating personas, we aim to increase the diversity of communication styles in the user simulator. Experimental results show that the proposed method enables the user simulator to generate utterances with greater variation.

2024

We demonstrate DialBB, a dialogue system development framework, which we have been building as educational material for dialogue system technology. Building a dialogue system requires the adoption of an appropriate architecture depending on the application and the integration of various technologies. However, this is not easy for those who have just started learning dialogue system technology. Therefore, there is a demand for educational materials that integrate various technologies to build dialogue systems, because traditional dialogue system development frameworks were not designed for educational purposes. DialBB enables the development of dialogue systems by combining modules called building blocks. After understanding sample applications, learners can easily build simple systems using built-in blocks and can build advanced systems using their own developed blocks.

2020

For the acquisition of knowledge through dialogues, it is crucial for systems to ask questions that do not diminish the user’s willingness to talk, i.e., that do not degrade the user’s impression. This paper reports the results of our analysis on how user impression changes depending on the types of questions to acquire lexical knowledge, that is, explicit and implicit questions, and the correctness of the content of the questions. We also analyzed how sequences of the same type of questions affect user impression. User impression scores were collected from 104 participants recruited via crowdsourcing and then regression analysis was conducted. The results demonstrate that implicit questions give a good impression when their content is correct, but a bad impression otherwise. We also found that consecutive explicit questions are more annoying than implicit ones when the content of the questions is correct. Our findings reveal helpful insights for creating a strategy to avoid user impression deterioration during knowledge acquisition.

2019

2018

2017

We address the problem of acquiring the ontological categories of unknown terms through implicit confirmation in dialogues. We develop an approach that makes implicit confirmation requests with an unknown term’s predicted category. Our approach does not degrade user experience with repetitive explicit confirmations, but the system has difficulty determining whether the information in the confirmation request can be correctly acquired. To overcome this challenge, we propose a method for determining whether the predicted category included in an implicit confirmation request is correct. Our method exploits multiple user responses to implicit confirmation requests containing the same ontological category. Experimental results revealed that the proposed method achieved higher precision in identifying correctly predicted categories than when only single user responses were considered.
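One way to picture the use of multiple user responses is the sketch below. The abstract does not specify how the responses are combined, so everything here is an assumption for illustration: the response labels (`"accept"`, `"deny"`), the function name `category_is_confirmed`, and the thresholds are all hypothetical, and a simple denial-rate rule is swapped in for whatever decision procedure the paper uses.

```python
from collections import Counter


def category_is_confirmed(responses, min_responses=3, max_denial_rate=0.2):
    """Decide whether a predicted category can be accepted as correct.

    `responses` holds one label per user response to implicit confirmation
    requests that contained the same predicted category. Labels and
    thresholds are illustrative assumptions, not values from the paper.
    """
    if len(responses) < min_responses:
        # Too little evidence: a single response is treated as unreliable.
        return False
    counts = Counter(responses)
    denial_rate = counts["deny"] / len(responses)
    return denial_rate <= max_denial_rate


if __name__ == "__main__":
    print(category_is_confirmed(["accept"] * 5))
    print(category_is_confirmed(["accept", "accept", "deny"]))
```

Requiring several responses before accepting a category trades recall for precision, which matches the abstract's emphasis on precision.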

2016

2015

2013

2012

2011

2010

2009

2008

2007

2006

2003

2000

1999

1998

1996

1994

1991