Ryuichiro Higashinaka

2025

Investigating the Impact of Incremental Processing and Voice Activity Projection on Spoken Dialogue Systems
Yuya Chiba | Ryuichiro Higashinaka
Proceedings of the 31st International Conference on Computational Linguistics

The naturalness of responses in spoken dialogue systems has been significantly improved by the introduction of large language models (LLMs), although many challenges remain until human-like turn-taking can be achieved. A turn-taking model called Voice Activity Projection (VAP) is gaining attention because it can be trained in an unsupervised manner using the spoken dialogue data between two speakers. For such a turn-taking model to be fully effective, systems must initiate response generation as soon as a turn-shift is detected. This can be achieved by incremental response generation, which reduces the delay before the system responds. Incremental response generation is done using partial speech recognition results while user speech is incrementally processed. Combining incremental response generation with VAP-based turn-taking will enable spoken dialogue systems to achieve faster and more natural turn-taking. However, their effectiveness remains unclear because they have not yet been evaluated in real-world systems. In this study, we developed spoken dialogue systems that incorporate incremental response generation and VAP-based turn-taking and evaluated their impact on task success and dialogue satisfaction through user assessments.

2024

Collecting and Analyzing Dialogues in a Tagline Co-Writing Task
Xulin Zhou | Takuma Ichikawa | Ryuichiro Higashinaka
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

The potential usage scenarios of dialogue systems will be greatly expanded if they are able to collaborate more creatively with humans. Many studies have examined ways of building such systems, but most of them focus on problem-solving dialogues, and relatively little research has been done on systems that can engage in creative collaboration with users. In this study, we designed a tagline co-writing task in which two people collaborate to create taglines via text chat, created an interface for data collection, and collected dialogue logs, editing logs, and questionnaire results. In total, we collected 782 Japanese dialogues. We describe the characteristic interactions comprising the tagline co-writing task and report the results of our analysis, in which we examined the kind of utterances that appear in the dialogues as well as the most frequent expressions found in highly rated dialogues in subjective evaluations. We also analyzed the relationship between subjective evaluations and workflow utilized in the dialogues and the interplay between taglines and utterances.

I Remember You!: SUI Corpus for Remembering and Utilizing Users’ Information in Chat-oriented Dialogue Systems
Yuiko Tsunomori | Ryuichiro Higashinaka
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

To construct a chat-oriented dialogue system that will be used for a long time by users, it is important to build a good relationship between the user and the system. To achieve a good relationship, several methods for remembering and utilizing information on users (preferences, experiences, jobs, etc.) in system utterances have been investigated. One way to do this is to utilize user information to fill in utterance templates for use in response generation, but the utterances do not always fit the context. Another way is to use neural-based generation, but in current methods, user information can be incorporated only when the current dialogue topic is similar to that of the user information. This paper tackled these problems by constructing a novel corpus to incorporate arbitrary user information into system utterances regardless of the current dialogue topic while retaining appropriateness for the context. We then fine-tuned a model for generating system utterances using the constructed corpus. The result of a subjective evaluation demonstrated the effectiveness of our model. Furthermore, we incorporated our fine-tuned model into a dialogue system and confirmed the effectiveness of the system through interactive dialogues with users.

JMultiWOZ: A Large-Scale Japanese Multi-Domain Task-Oriented Dialogue Dataset
Atsumoto Ohashi | Ryu Hirai | Shinya Iizuka | Ryuichiro Higashinaka
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Dialogue datasets are crucial for deep learning-based task-oriented dialogue system research. While numerous English language multi-domain task-oriented dialogue datasets have been developed and contributed to significant advancements in task-oriented dialogue systems, such a dataset does not exist in Japanese, and research in this area is limited compared to that in English. In this study, towards the advancement of research and development of task-oriented dialogue systems in Japanese, we constructed JMultiWOZ, the first Japanese language large-scale multi-domain task-oriented dialogue dataset. Using JMultiWOZ, we evaluated the dialogue state tracking and response generation capabilities of the state-of-the-art methods on the existing major English benchmark dataset MultiWOZ2.2 and the latest large language model (LLM)-based methods. Our evaluation results demonstrated that JMultiWOZ provides a benchmark that is on par with MultiWOZ2.2. In addition, through evaluation experiments of interactive dialogues with the models and human participants, we identified limitations in the task completion capabilities of LLMs in Japanese.

Estimating the Emotional Valence of Interlocutors Using Heterogeneous Sensors in Human-Human Dialogue
Jingjing Jiang | Ao Guo | Ryuichiro Higashinaka
Proceedings of the 25th Annual Meeting of the Special Interest Group on Discourse and Dialogue

Dialogue systems need to accurately understand the user’s mental state to generate appropriate responses, but accurately discerning such states solely from text or speech can be challenging. To determine which information is necessary, we first collected human-human multimodal dialogues using heterogeneous sensors, resulting in a dataset containing various types of information including speech, video, physiological signals, gaze, and body movement. Additionally, for each time step of the data, users provided subjective evaluations of their emotional valence while reviewing the dialogue videos. Using this dataset and focusing on physiological signals, we analyzed the relationship between the signals and the subjective evaluations through Granger causality analysis. We also investigated how sensor signals differ depending on the polarity of the valence. Our findings revealed several physiological signals related to the user’s emotional valence.

2023

Enhancing Task-oriented Dialogue Systems with Generative Post-processing Networks
Atsumoto Ohashi | Ryuichiro Higashinaka
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

Recently, post-processing networks (PPNs), which modify the outputs of arbitrary modules including non-differentiable ones in task-oriented dialogue systems, have been proposed. PPNs have successfully improved the dialogue performance by post-processing natural language understanding (NLU), dialogue state tracking (DST), and dialogue policy (Policy) modules with a classification-based approach. However, they cannot be applied to natural language generation (NLG) modules because the post-processing of the utterance output by the NLG module requires a generative approach. In this study, we propose a new post-processing component for NLG, generative post-processing networks (GenPPNs). For optimizing GenPPNs via reinforcement learning, the reward function incorporates dialogue act contribution, a new measure to evaluate the contribution of GenPPN-generated utterances with regard to task completion in dialogue. Through simulation and human evaluation experiments based on the MultiWOZ dataset, we confirmed that GenPPNs improve the task completion performance of task-oriented dialogue systems.

Modeling Collaborative Dialogue in Minecraft with Action-Utterance Model
Takuma Ichikawa | Ryuichiro Higashinaka
Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics: Student Research Workshop

RealPersonaChat: A Realistic Persona Chat Corpus with Interlocutors’ Own Personalities
Sanae Yamashita | Koji Inoue | Ao Guo | Shota Mochizuki | Tatsuya Kawahara | Ryuichiro Higashinaka
Proceedings of the 37th Pacific Asia Conference on Language, Information and Computation

Applying Item Response Theory to Task-oriented Dialogue Systems for Accurately Determining User’s Task Success Ability
Ryu Hirai | Ao Guo | Ryuichiro Higashinaka
Proceedings of the 24th Annual Meeting of the Special Interest Group on Discourse and Dialogue

While task-oriented dialogue systems have improved, not all users can fully accomplish their tasks. Users with limited knowledge about the system may experience dialogue breakdowns or fail to achieve their tasks because they do not know how to interact with the system. To address this issue, it would be desirable to construct a system that can estimate the user’s task success ability and adapt to that ability. In this study, we propose a method that estimates this ability by applying item response theory (IRT), commonly used in education for estimating examinee abilities, to task-oriented dialogue systems. Through experiments predicting the probability of a correct answer to each slot by using the estimated task success ability, we found that the proposed method significantly outperformed baselines.

2022

Combining Argumentation Structure and Language Model for Generating Natural Argumentative Dialogue
Koh Mitsuda | Ryuichiro Higashinaka | Kuniko Saito
Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

Argumentative dialogue is an important process where speakers discuss a specific theme for consensus building or decision making. In previous studies for generating consistent argumentative dialogue, retrieval-based methods with hand-crafted argumentation structures have been used. In this study, we propose a method to generate natural argumentative dialogues by combining an argumentation structure and language model. We trained the language model to rewrite a proposition of an argumentation structure on the basis of its information, such as keywords and stance, into the next utterance while considering its context, and we used the model to rewrite propositions in the argumentation structure. We manually evaluated the generated dialogues and found that the proposed method significantly improved the naturalness of dialogues without losing consistency of argumentation.

Optimal Summaries for Enabling a Smooth Handover in Chat-Oriented Dialogue
Sanae Yamashita | Ryuichiro Higashinaka
Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing: Student Research Workshop

In dialogue systems, one option for creating a better dialogue experience for the user is to have a human operator take over the dialogue when the system runs into trouble communicating with the user. In this type of handover situation (we call it intervention), it is useful for the operator to have access to the dialogue summary. However, it is not clear exactly what type of summary would be the most useful for a smooth handover. In this study, we investigated the optimal type of summary through experiments in which interlocutors were presented with various summary types during interventions in order to examine their effects. Our findings showed that the best summaries were an abstractive summary plus one utterance immediately before the handover and an extractive summary consisting of five utterances immediately before the handover. From the viewpoint of computational cost, we recommend that extractive summaries consisting of the last five utterances be used.

Investigating person-specific errors in chat-oriented dialogue systems
Koh Mitsuda | Ryuichiro Higashinaka | Tingxuan Li | Sen Yoshida
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Creating chatbots to behave like real people is important in terms of believability. Errors in general chatbots and chatbots that follow a rough persona have been studied, but those in chatbots that behave like real people have not been thoroughly investigated. We collected a large number of user interactions with a generation-based chatbot trained on large-scale dialogue data of a specific character, i.e., target person, and analyzed errors related to that person. We found that person-specific errors can be divided into two types: errors in attributes and those in relations, each of which can be divided into two levels: self and other. The correspondence with an existing taxonomy of errors was also investigated, and person-specific errors that should be addressed in the future were clarified.

Adaptive Natural Language Generation for Task-oriented Dialogue via Reinforcement Learning
Atsumoto Ohashi | Ryuichiro Higashinaka
Proceedings of the 29th International Conference on Computational Linguistics

When a natural language generation (NLG) component is implemented in a real-world task-oriented dialogue system, it is necessary to generate not only natural utterances as learned on training data but also utterances adapted to the dialogue environment (e.g., noise from environmental sounds) and the user (e.g., users with low levels of understanding ability). Inspired by recent advances in reinforcement learning (RL) for language generation tasks, we propose ANTOR, a method for Adaptive Natural language generation for Task-Oriented dialogue via Reinforcement learning. In ANTOR, a natural language understanding (NLU) module, which corresponds to the user’s understanding of system utterances, is incorporated into the objective function of RL. If the NLG’s intentions are correctly conveyed to the NLU, which understands a system’s utterances, the NLG is given a positive reward. We conducted experiments on the MultiWOZ dataset, and we confirmed that ANTOR could generate adaptive utterances against speech recognition errors and the different vocabulary levels of users.

A Speculative and Tentative Common Ground Handling for Efficient Composition of Uncertain Dialogue
Saki Sudo | Kyoshiro Asano | Koh Mitsuda | Ryuichiro Higashinaka | Yugo Takeuchi
Proceedings of the Thirteenth Language Resources and Evaluation Conference

This study investigates how the grounding process is composed and explores new interaction approaches that adapt to human cognitive processes that have not yet been significantly studied. The results of an experiment indicate that grounding through dialogue is mutually accepted among participants through holistic expressions and suggest that common ground among participants may not necessarily be formed in a bottom-up way through analytic expressions. These findings raise the possibility of a promising new approach to creating a human-like dialogue system that may be more suitable for natural human communication.

Analysis of Dialogue in Human-Human Collaboration in Minecraft
Takuma Ichikawa | Ryuichiro Higashinaka
Proceedings of the Thirteenth Language Resources and Evaluation Conference

Recently, many studies have focused on developing dialogue systems that enable collaborative work; however, they rarely focus on creative tasks. Collaboration for creative work, in which humans and systems collaborate to create new value, will be essential for future dialogue systems. In this study, we collected 500 dialogues of human-human collaboration in Minecraft as a basis for developing a dialogue system that enables creative collaborative work. We conceived the Collaborative Garden Task, where two workers interact and collaborate in Minecraft to create a garden, and we collected dialogue, action logs, and subjective evaluations. We also collected third-person evaluations of the gardens and analyzed the relationship between dialogue and collaborative work that received high scores on the subjective and third-person evaluations in order to identify dialogic factors for high-quality collaborative work. We found that two essential aspects in creative collaborative work are performing more processes to ask for and agree on suggestions between workers and agreeing on a particular image of the final product in the early phase of work and then discussing changes and details.

Data Collection for Empirically Determining the Necessary Information for Smooth Handover in Dialogue
Sanae Yamashita | Ryuichiro Higashinaka
Proceedings of the Thirteenth Language Resources and Evaluation Conference

Despite recent advances, dialogue systems still struggle to achieve fully autonomous transactions. Therefore, when a system encounters a problem, human operators need to take over the dialogue to complete the transaction. However, it is unclear what information should be presented to the operator when this handover takes place. In this study, we conducted a data collection experiment in which one of two operators talked to a user and switched with the other operator periodically while exchanging notes when the handovers took place. By examining these notes, it is possible to identify the information necessary for handing over the dialogue. We collected 60 dialogues in which two operators switched periodically while performing chat, consultation, and sales tasks in dialogue. We found that adjacency pairs are a useful representation for recording conversation history. In addition, we found that key-value-pair representation is also useful when there are underlying tasks, such as consultation and sales.

Dialogue Corpus Construction Considering Modality and Social Relationships in Building Common Ground
Yuki Furuya | Koki Saito | Kosuke Ogura | Koh Mitsuda | Ryuichiro Higashinaka | Kazunori Takashio
Proceedings of the Thirteenth Language Resources and Evaluation Conference

Building common ground with users is essential for dialogue agent systems and robots to interact naturally with people. While a few previous studies have investigated the process of building common ground in human-human dialogue, most of them have been conducted on the basis of text chat. In this study, we constructed a dialogue corpus to investigate the process of building common ground with a particular focus on the modality of dialogue and the social relationship between the participants in the process of building common ground, which are important but have not been investigated in the previous work. The results of our analysis suggest that adding the modality or developing the relationship between workers speeds up the building of common ground. Specifically, regarding the modality, the presence of video rather than only audio may unconsciously facilitate work, and as for the relationship, it is easier to convey information about emotions and turn-taking among friends than in first meetings. These findings and the corpus should prove useful for developing a system to support remote communication.

Dialogue Collection for Recording the Process of Building Common Ground in a Collaborative Task
Koh Mitsuda | Ryuichiro Higashinaka | Yuhei Oga | Sen Yoshida
Proceedings of the Thirteenth Language Resources and Evaluation Conference

To develop a dialogue system that can build common ground with users, the process of building common ground through dialogue needs to be clarified. However, the studies on the process of building common ground have not been well conducted; much work has focused on finding the relationship between a dialogue in which users perform a collaborative task and its task performance represented by the final result of the task. In this study, to clarify the process of building common ground, we propose a data collection method for automatically recording the process of building common ground through a dialogue by using the intermediate result of a task. We collected 984 dialogues, and as a result of investigating the process of building common ground, we found that the process can be classified into several typical patterns and that conveying each worker’s understanding through affirmation of a counterpart’s utterances especially contributes to building common ground. In addition, toward dialogue systems that can build common ground, we conducted an automatic estimation of the degree of built common ground and found that its degree can be estimated quite accurately.

Collection and Analysis of Travel Agency Task Dialogues with Age-Diverse Speakers
Michimasa Inaba | Yuya Chiba | Ryuichiro Higashinaka | Kazunori Komatani | Yusuke Miyao | Takayuki Nagai
Proceedings of the Thirteenth Language Resources and Evaluation Conference

When individuals communicate with each other, they use different vocabulary, speaking speed, facial expressions, and body language depending on the people they talk to. This paper focuses on the speaker’s age as a factor that affects the change in communication. We collected a multimodal dialogue corpus with a wide range of speaker ages. As a dialogue task, we focus on travel, which interests people of all ages, and we set up a task based on a tourism consultation between an operator and a customer at a travel agency. This paper provides details of the dialogue task, the collection procedure and annotations, and the analysis of the characteristics of the dialogues and facial expressions focusing on the age of the speakers. Results of the analysis suggest that the adult speakers have more independent opinions, the older speakers express their opinions more frequently than other age groups, and the operators smiled more frequently at the minor speakers.

Post-processing Networks: Method for Optimizing Pipeline Task-oriented Dialogue Systems using Reinforcement Learning
Atsumoto Ohashi | Ryuichiro Higashinaka
Proceedings of the 23rd Annual Meeting of the Special Interest Group on Discourse and Dialogue

Many studies have proposed methods for optimizing the dialogue performance of an entire pipeline task-oriented dialogue system by jointly training modules in the system using reinforcement learning. However, these methods are limited in that they can only be applied to modules implemented using trainable neural-based methods. To solve this problem, we propose a method for optimizing a pipeline system composed of modules implemented with arbitrary methods for dialogue performance. With our method, neural-based components called post-processing networks (PPNs) are installed inside such a system to post-process the output of each module. All PPNs are updated to improve the overall dialogue performance of the system by using reinforcement learning, not necessitating each module to be differentiable. Through dialogue simulation and human evaluation on the MultiWOZ dataset, we show that our method can improve the dialogue performance of pipeline systems consisting of various modules.

2021

Influence of user personality on dialogue task performance: A case study using a rule-based dialogue system
Ao Guo | Atsumoto Ohashi | Ryu Hirai | Yuya Chiba | Yuiko Tsunomori | Ryuichiro Higashinaka
Proceedings of the 3rd Workshop on Natural Language Processing for Conversational AI

Endowing a task-oriented dialogue system with adaptiveness to user personality can greatly help improve the performance of a dialogue task. However, such a dialogue system can be practically challenging to implement, because it is unclear how user personality influences dialogue task performance. To explore the relationship between user personality and dialogue task performance, we enrolled participants via crowdsourcing to first answer specified personality questionnaires and then chat with a dialogue system to accomplish assigned tasks. A rule-based dialogue system on the prevalent Multi-Domain Wizard-of-Oz (MultiWOZ) task was used. A total of 211 participants’ personalities and their 633 dialogues were collected and analyzed. The results revealed that sociable and extroverted people tended to fail the task, whereas neurotic people were more likely to succeed. We extracted features related to user dialogue behaviors and performed further analysis to determine which kind of behavior influences task performance. As a result, we identified that average utterance length and slots per utterance are the key features of dialogue behavior that are highly correlated with both task performance and user personality.

Variation across Everyday Conversations: Factor Analysis of Conversations using Semantic Categories of Functional Expressions
Yuya Chiba | Ryuichiro Higashinaka
Proceedings of the 35th Pacific Asia Conference on Language, Information and Computation

Task Definition and Integration For Scientific-Document Writing Support
Hiromi Narimatsu | Kohei Koyama | Kohji Dohsaka | Ryuichiro Higashinaka | Yasuhiro Minami | Hirotoshi Taira
Proceedings of the Second Workshop on Scholarly Document Processing

With the increase in the number of published academic papers, growing expectations have been placed on research related to supporting the writing process of scientific papers. Recently, research has been conducted on various tasks such as citation worthiness (judging whether a sentence requires citation), citation recommendation, and citation-text generation. However, since each task has been studied and evaluated using data that has been independently developed, it is currently impossible to verify whether such tasks can be successfully pipelined to effective use in scientific-document writing. In this paper, we first define a series of tasks related to scientific-document writing that can be pipelined. Then, we create a dataset of academic papers that can be used for the evaluation of each task as well as a series of these tasks. Finally, using the dataset, we evaluate the tasks of citation worthiness and citation recommendation as well as both of these tasks integrated. The results of our evaluations show that the proposed approach is promising.

Integrated taxonomy of errors in chat-oriented dialogue systems
Ryuichiro Higashinaka | Masahiro Araki | Hiroshi Tsukahara | Masahiro Mizukami
Proceedings of the 22nd Annual Meeting of the Special Interest Group on Discourse and Dialogue

This paper proposes a taxonomy of errors in chat-oriented dialogue systems. Previously, two taxonomies were proposed; one is theory-driven and the other data-driven. The former suffers from the fact that dialogue theories for human conversation are often not appropriate for categorizing errors made by chat-oriented dialogue systems. The latter has limitations in that it can only cope with errors of systems for which we have data. This paper integrates these two taxonomies to create a comprehensive taxonomy of errors in chat-oriented dialogue systems. We found that, with our integrated taxonomy, errors can be reliably annotated with a higher Fleiss’ kappa compared with the previously proposed taxonomies.

2020

This paper concerns the problem of realizing consistent personalities in neural conversational modeling by using user-generated question-answer pairs as training data. Using the framework of role play-based question answering, we collected single-turn question-answer pairs for particular characters from online users. Meta information related to the question-answer pairs, such as emotion and intimacy, was also collected. We verified the quality of the collected data and, by subjective evaluation, we also verified their usefulness in training neural conversational models for generating utterances reflecting the meta information, especially emotion.

Collection and Analysis of Dialogues Provided by Two Speakers Acting as One
Tsunehiro Arimoto | Ryuichiro Higashinaka | Kou Tanaka | Takahito Kawanishi | Hiroaki Sugiyama | Hiroshi Sawada | Hiroshi Ishiguro
Proceedings of the 21th Annual Meeting of the Special Interest Group on Discourse and Dialogue

We are studying a cooperation style where multiple speakers can provide both advanced dialogue services and operator education. We focus on a style in which two operators interact with a user by pretending to be a single operator. For two operators to effectively act as one, each must adjust his/her conversational content and timing to the other. In the process, we expect each operator to experience the conversational content of his/her partner as if it were his/her own, creating efficient and effective learning of the other’s skill. We analyzed this educational effect and examined whether dialogue services can be successfully provided by collecting travel guidance dialogue data from operators who give travel information to users. In this paper, we report our preliminary results on dialogue content and user satisfaction of operators and users.

2018

Multi-task and Multi-lingual Joint Learning of Neural Lexical Utterance Classification based on Partially-shared Modeling
Ryo Masumura | Tomohiro Tanaka | Ryuichiro Higashinaka | Hirokazu Masataki | Yushi Aono
Proceedings of the 27th International Conference on Computational Linguistics

This paper is an initial study on multi-task and multi-lingual joint learning for lexical utterance classification. A major problem in constructing lexical utterance classification modules for spoken dialogue systems is that individual data resources are often limited or unbalanced among tasks and/or languages. Various studies have examined joint learning using neural-network based shared modeling; however, previous joint learning studies focused on either cross-task or cross-lingual knowledge transfer. In order to simultaneously support both multi-task and multi-lingual joint learning, our idea is to explicitly divide state-of-the-art neural lexical utterance classification into language-specific components that can be shared between different tasks and task-specific components that can be shared between different languages. In addition, in order to effectively transfer knowledge between different task data sets and different language data sets, this paper proposes a partially-shared modeling method that possesses both shared components and components specific to individual data sets. We demonstrate the effectiveness of the proposed method using Japanese and English data sets with three different lexical utterance classification tasks.

Adversarial Training for Multi-task and Multi-lingual Joint Modeling of Utterance Intent Classification
Ryo Masumura | Yusuke Shinohara | Ryuichiro Higashinaka | Yushi Aono
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

This paper proposes an adversarial training method for the multi-task and multi-lingual joint modeling needed for utterance intent classification. In joint modeling, common knowledge can be efficiently utilized among multiple tasks or multiple languages. This is achieved by introducing both language-specific networks shared among different tasks and task-specific networks shared among different languages. However, the shared networks are often specialized in majority tasks or languages, so performance degradation must be expected for some minor data sets. In order to improve the invariance of shared networks, the proposed method introduces both language-specific task adversarial networks and task-specific language adversarial networks; both are leveraged for purging the task or language dependencies of the shared networks. The effectiveness of the adversarial training proposal is demonstrated using Japanese and English data sets for three different utterance intent classification tasks.

pdf bib
Predicting Nods by using Dialogue Acts in Dialogue
Ryo Ishii | Ryuichiro Higashinaka | Junji Tomita
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

pdf bib
Creating Large-Scale Argumentation Structures for Dialogue Systems
Kazuki Sakai | Akari Inago | Ryuichiro Higashinaka | Yuichiro Yoshikawa | Hiroshi Ishiguro | Junji Tomita
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

pdf bib abs
Introduction method for argumentative dialogue using paired question-answering interchange about personality
Kazuki Sakai | Ryuichiro Higashinaka | Yuichiro Yoshikawa | Hiroshi Ishiguro | Junji Tomita
Proceedings of the 19th Annual SIGdial Meeting on Discourse and Dialogue

To provide a better discussion experience in current argumentative dialogue systems, it is necessary for the user to feel motivated to participate, even if the system already responds appropriately. In this paper, we propose a method that can smoothly introduce argumentative dialogue by inserting an initial discourse, consisting of question-answer pairs concerning personality. The system can induce interest of the users prior to agreement or disagreement during the main discourse. By disclosing their interests, the users will feel familiarity and motivation to further engage in the argumentative dialogue and understand the system’s intent. To verify the effectiveness of a question-answer dialogue inserted before the argument, a subjective experiment was conducted using a text chat interface. The results suggest that inserting the question-answer dialogue enhances familiarity and naturalness. Notably, the results suggest that women more than men regard the dialogue as more natural and the argument as deepened, following an exchange concerning personality.

This paper proposes a fully neural-network-based, dialogue-context online end-of-turn detection method that can utilize long-range interactive information extracted from both the speaker’s and the collocutor’s utterances. The proposed method combines multiple time-asynchronous long short-term memory recurrent neural networks, which can capture multiple sequential features of the speaker and the collocutor as well as their interactions. Assuming that the proposed method is applied to spoken dialogue systems, we introduce the speaker’s acoustic sequential features and the collocutor’s linguistic sequential features, each of which can be extracted in an online manner. Our evaluation confirms the effectiveness of taking into consideration the dialogue context formed by the speaker’s and the collocutor’s utterances.

pdf bib abs
Role play-based question-answering by real users for building chatbots with consistent personalities
Ryuichiro Higashinaka | Masahiro Mizukami | Hidetoshi Kawabata | Emi Yamaguchi | Noritake Adachi | Junji Tomita
Proceedings of the 19th Annual SIGdial Meeting on Discourse and Dialogue

Having consistent personalities is important for chatbots if we want them to be believable. Typically, many question-answer pairs are prepared by hand for achieving consistent responses; however, the creation of such pairs is costly. In this study, our goal is to collect a large number of question-answer pairs for a particular character by using role play-based question-answering in which multiple users play the roles of certain characters and respond to questions by online users. Focusing on two famous characters, we conducted a large-scale experiment to collect question-answer pairs by using real users. We evaluated the effectiveness of role play-based question-answering and found that, by using our proposed method, the collected pairs lead to good-quality chatbots that exhibit consistent personalities.

2017

pdf bib abs
Hyperspherical Query Likelihood Models with Word Embeddings
Ryo Masumura | Taichi Asami | Hirokazu Masataki | Kugatsu Sadamitsu | Kyosuke Nishida | Ryuichiro Higashinaka
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

This paper presents an initial study on hyperspherical query likelihood models (QLMs) for information retrieval (IR). Our motivation is to naturally utilize pre-trained word embeddings for probabilistic IR. To this end, the key idea is to directly leverage the word embeddings as random variables for directional probabilistic models based on von Mises-Fisher distributions, which are closely related to cosine distances. The proposed method enables us to theoretically take semantic similarities between document and target queries into consideration without introducing heuristic expansion techniques. In addition, this paper reveals relationships between hyperspherical QLMs and conventional QLMs. Experiments show document retrieval evaluation results in which a hyperspherical QLM is compared to conventional QLMs and document distance metrics using word or document embeddings.

pdf bib abs
Investigating the Effect of Conveying Understanding Results in Chat-Oriented Dialogue Systems
Koh Mitsuda | Ryuichiro Higashinaka | Junji Tomita
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

In dialogue systems, conveying understanding results of user utterances is important because it enables users to feel understood by the system. However, it is not clear what types of understanding results should be conveyed to users; some utterances may be offensive and some may be too commonsensical. In this paper, we explored the effect of conveying understanding results of user utterances in a chat-oriented dialogue system through an experiment with human subjects. As a result, we found that only certain types of understanding results, such as those related to a user’s permanent state, are effective in improving user satisfaction. This paper clarifies the types of understanding results that can be safely uttered by a system.

2016

pdf bib abs
The dialogue breakdown detection challenge: Task description, datasets, and evaluation metrics
Ryuichiro Higashinaka | Kotaro Funakoshi | Yuka Kobayashi | Michimasa Inaba
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

Dialogue breakdown detection is a promising technique in dialogue systems. To promote the research and development of such a technique, we organized a dialogue breakdown detection challenge where the task is to detect a system’s inappropriate utterances that lead to dialogue breakdowns in chat. This paper describes the design, datasets, and evaluation metrics for the challenge as well as the methods and results of the submitted runs of the participants.

pdf bib
Proceedings of the 17th Annual Meeting of the Special Interest Group on Discourse and Dialogue
Raquel Fernandez | Wolfgang Minker | Giuseppe Carenini | Ryuichiro Higashinaka | Ron Artstein | Alesia Gainer
Proceedings of the 17th Annual Meeting of the Special Interest Group on Discourse and Dialogue

pdf bib
Analyzing Post-dialogue Comments by Speakers – How Do Humans Personalize Their Utterances in Dialogue? –
Toru Hirano | Ryuichiro Higashinaka | Yoshihiro Matsuo
Proceedings of the 17th Annual Meeting of the Special Interest Group on Discourse and Dialogue

pdf bib
Towards an Entertaining Natural Language Generation System: Linguistic Peculiarities of Japanese Fictional Characters
Chiaki Miyazaki | Toru Hirano | Ryuichiro Higashinaka | Yoshihiro Matsuo
Proceedings of the 17th Annual Meeting of the Special Interest Group on Discourse and Dialogue

pdf bib abs
A Hierarchical Neural Network for Information Extraction of Product Attribute and Condition Sentences
Yukinori Homma | Kugatsu Sadamitsu | Kyosuke Nishida | Ryuichiro Higashinaka | Hisako Asano | Yoshihiro Matsuo
Proceedings of the Open Knowledge Base and Question Answering Workshop (OKBQA 2016)

This paper describes a hierarchical neural network we propose for sentence classification to extract product information from product documents. The network classifies each sentence in a document into attribute and condition classes on the basis of word sequences and sentence sequences in the document. Experimental results showed that the method using the proposed network significantly outperformed baseline methods by taking semantic representations of word and sentence sequential data into account. We also evaluated the network on two different product domains (insurance and tourism) and found that it was effective for both domains.

2015

pdf bib
Fatal or not? Finding errors that lead to dialogue breakdowns in chat-oriented dialogue systems
Ryuichiro Higashinaka | Masahiro Mizukami | Kotaro Funakoshi | Masahiro Araki | Hiroshi Tsukahara | Yuka Kobayashi
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

pdf bib
Classification and Acquisition of Contradictory Event Pairs using Crowdsourcing
Yu Takabatake | Hajime Morita | Daisuke Kawahara | Sadao Kurohashi | Ryuichiro Higashinaka | Yoshihiro Matsuo
Proceedings of the 3rd Workshop on EVENTS: Definition, Detection, Coreference, and Representation

pdf bib
Towards Taxonomy of Errors in Chat-oriented Dialogue Systems
Ryuichiro Higashinaka | Kotaro Funakoshi | Masahiro Araki | Hiroshi Tsukahara | Yuka Kobayashi | Masahiro Mizukami
Proceedings of the 16th Annual Meeting of the Special Interest Group on Discourse and Dialogue

pdf bib
Discourse Relation Recognition by Comparing Various Units of Sentence Expression with Recursive Neural Network
Atsushi Otsuka | Toru Hirano | Chiaki Miyazaki | Ryo Masumura | Ryuichiro Higashinaka | Toshiro Makino | Yoshihiro Matsuo
Proceedings of the 29th Pacific Asia Conference on Language, Information and Computation

pdf bib
Automatic conversion of sentence-end expressions for utterance characterization of dialogue systems
Chiaki Miyazaki | Toru Hirano | Ryuichiro Higashinaka | Toshiro Makino | Yoshihiro Matsuo
Proceedings of the 29th Pacific Asia Conference on Language, Information and Computation

2014

pdf bib
Predicate-Argument Structure Analysis with Zero-Anaphora Resolution for Dialogue Systems
Kenji Imamura | Ryuichiro Higashinaka | Tomoko Izumi
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

pdf bib abs
Extraction of Daily Changing Words for Question Answering
Kugatsu Sadamitsu | Ryuichiro Higashinaka | Yoshihiro Matsuo
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

This paper proposes a method for extracting Daily Changing Words (DCWs), words that indicate which questions are real-time dependent. Our approach is based on two types of template matching, using time and named entity slots, over large corpora, together with simple filtering methods that use news corpora. Extracted DCWs are utilized for detecting and sorting real-time dependent questions. Experiments confirm that our DCW method achieves higher accuracy in detecting real-time dependent questions than existing word classes and a simple supervised machine learning approach.

This paper describes a dialogue data collection experiment and resulting corpus for dialogues between a senior mobile journalist and a junior cub reporter back at the office. The purpose of the dialogue is for the mobile journalist to collect background information in preparation for an interview or on-the-site coverage of a breaking story. The cub reporter has access to text archives that contain such background information. A unique aspect of these dialogues is that they capture information-seeking behavior for an open-ended task against a large unstructured data source. Initial analyses of the corpus show that the experimental design leads to real-time, mixed-initiative, highly interactive dialogues with many interesting properties.

pdf bib
Learning to Generate Naturalistic Utterances Using Reviews in Spoken Dialogue Systems
Ryuichiro Higashinaka | Rashmi Prasad | Marilyn A. Walker
Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics