2024
Proceedings of the 25th Annual Meeting of the Special Interest Group on Discourse and Dialogue
Tatsuya Kawahara | Vera Demberg | Stefan Ultes | Koji Inoue | Shikib Mehri | David Howcroft | Kazunori Komatani
DialBB: A Dialogue System Development Framework as an Educational Material
Mikio Nakano | Kazunori Komatani
Proceedings of the 25th Annual Meeting of the Special Interest Group on Discourse and Dialogue
We demonstrate DialBB, a dialogue system development framework that we have been building as an educational material for dialogue system technology. Building a dialogue system requires adopting an architecture appropriate to the application and integrating various technologies, which is not easy for those who have just started learning dialogue system technology. Because traditional dialogue system development frameworks were not designed for educational purposes, there is a demand for educational materials that integrate the various technologies needed to build dialogue systems. DialBB enables the development of dialogue systems by combining modules called building blocks. After understanding sample applications, learners can easily build simple systems using built-in blocks and can then build advanced systems using blocks they develop themselves.
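The block-composition idea the abstract describes can be sketched in a few lines. This is a hypothetical illustration of combining modules into a pipeline over a shared session state, not DialBB's actual API; the names `Block`, `Pipeline`, `LowercaseBlock`, and `CannedResponseBlock` are invented here.

```python
class Block:
    """One processing step: a 'building block' (illustrative, not DialBB's API)."""
    def process(self, state: dict) -> dict:
        raise NotImplementedError

class LowercaseBlock(Block):
    # Toy "understanding" block: normalize the user input.
    def process(self, state):
        state["input"] = state["input"].lower()
        return state

class CannedResponseBlock(Block):
    # Toy "generation" block: map a normalized input to a response.
    RESPONSES = {"hello": "Hi! How can I help you?"}
    def process(self, state):
        state["output"] = self.RESPONSES.get(state["input"],
                                             "Sorry, I did not understand.")
        return state

class Pipeline:
    """Runs the blocks in order over a shared session state."""
    def __init__(self, blocks):
        self.blocks = blocks
    def respond(self, utterance: str) -> str:
        state = {"input": utterance}
        for block in self.blocks:
            state = block.process(state)
        return state["output"]

bot = Pipeline([LowercaseBlock(), CannedResponseBlock()])
print(bot.respond("Hello"))  # Hi! How can I help you?
```

A learner could swap either toy block for a more capable one (e.g., a machine-learning understanding module) without changing the pipeline, which is the educational point of block-based composition.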
Collecting Human-Agent Dialogue Dataset with Frontal Brain Signal toward Capturing Unexpressed Sentiment
Shun Katada | Ryu Takeda | Kazunori Komatani
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Multimodal information such as text and audiovisual data has been used for emotion/sentiment estimation during human-agent dialogue; however, user sentiments are not necessarily expressed explicitly during dialogues. Biosignals such as brain signals recorded with an electroencephalogram (EEG) sensor have been a focus in the affective computing field for capturing unexpressed emotional changes in controlled experimental environments. In this study, we collect and analyze multimodal data with an EEG during a human-agent dialogue toward capturing unexpressed sentiment. Our contributions are as follows: (1) A new multimodal human-agent dialogue dataset is created, which includes not only text and audiovisual data but also frontal EEGs and physiological signals during the dialogue. In total, about 500 minutes of chat dialogues were collected from thirty participants aged 20 to 70. (2) We present a novel method for dealing with eye-blink noise when denoising frontal EEGs. The method applies facial landmark tracking to detect and delete eye-blink noise. (3) An experimental evaluation showed the effectiveness of the frontal EEGs: although they provide only three channels, they improved sentiment estimation performance when fused with the other modalities.
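The eye-blink handling in contribution (2) can be illustrated at the level of the general idea: delete EEG samples that fall inside blink intervals detected from facial landmark tracking. This is a toy sketch, not the paper's implementation; `delete_blink_segments` and the interval format are assumptions made here.

```python
def delete_blink_segments(eeg, blink_intervals):
    """Keep EEG samples whose indices fall outside every blink interval.
    blink_intervals: list of (start, end) sample indices, end exclusive,
    e.g., as detected from eyelid landmarks in the synchronized video."""
    return [sample for i, sample in enumerate(eeg)
            if all(not (start <= i < end) for start, end in blink_intervals)]

eeg = list(range(10))        # 10 dummy EEG samples
blinks = [(2, 4), (7, 8)]    # blinks detected at samples 2-3 and 7
print(delete_blink_segments(eeg, blinks))  # [0, 1, 4, 5, 6, 8, 9]
```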
2023
Analyzing Differences in Subjective Annotations by Participants and Third-party Annotators in Multimodal Dialogue Corpus
Kazunori Komatani | Ryu Takeda | Shogo Okada
Proceedings of the 24th Annual Meeting of the Special Interest Group on Discourse and Dialogue
Estimating the subjective impressions of human users during a dialogue is necessary when constructing a dialogue system that can respond adaptively to their emotional states. However, such subjective impressions (e.g., how much the user enjoys the dialogue) are inherently ambiguous, and the annotation results provided by multiple annotators do not always agree because they depend on the annotators' subjectivity. In this paper, we analyzed the annotation results of 13,226 exchanges from 155 participants in Hazumi, a multimodal dialogue corpus we constructed, where each exchange was annotated by five third-party annotators. We investigated the agreement between the subjective annotations given by the third-party annotators and by the participants themselves, for both per-exchange annotations (i.e., the participant's sentiments) and per-dialogue (per-participant) annotations (i.e., questionnaires on rapport and personality traits). We also investigated the conditions under which the annotation results are reliable. Our findings demonstrate that the dispersion of third-party sentiment annotations correlates with the agreeableness of the participants, one of the Big Five personality traits.
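The kind of analysis described, relating the dispersion of the five third-party sentiment labels to a participant trait score, can be sketched with fabricated data. The values below are invented for illustration; the paper's corpus and statistics are not reproduced.

```python
from statistics import mean, pstdev

def pearson(xs, ys):
    """Pearson correlation coefficient."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs) ** 0.5
    vy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (vx * vy)

# Five third-party sentiment labels per exchange, per participant (fabricated).
annotations = {
    "p1": [[5, 5, 5, 5, 5], [4, 4, 5, 4, 4]],   # annotators mostly agree
    "p2": [[1, 3, 5, 2, 4], [2, 5, 1, 4, 3]],   # annotators disagree
}
agreeableness = {"p1": 6.0, "p2": 2.5}           # fabricated trait scores

# Per-participant dispersion = mean standard deviation of labels per exchange.
dispersion = {p: mean(pstdev(labels) for labels in exchanges)
              for p, exchanges in annotations.items()}
r = pearson([dispersion[p] for p in annotations],
            [agreeableness[p] for p in annotations])
print(round(r, 3))  # -1.0 for this toy data: higher agreeableness, lower dispersion
```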
2022
Graph-combined Coreference Resolution Methods on Conversational Machine Reading Comprehension with Pre-trained Language Model
Zhaodong Wang | Kazunori Komatani
Proceedings of the Second DialDoc Workshop on Document-grounded Dialogue and Conversational Question Answering
Coreference resolution, such as for anaphora, has been an essential challenge commonly found in conversational machine reading comprehension (CMRC). This task aims to determine the referential entity to which a pronoun refers on the basis of contextual information. Existing approaches based on pre-trained language models (PLMs) mainly rely on an end-to-end method, which still has limitations in clarifying referential dependency. In this study, a novel graph-based approach is proposed to integrate the coreference of a given text into graph structures (called coreference graphs), which can pinpoint a pronoun's referential entity. We propose two graph-combined methods for CMRC, evidence-enhanced and the fusion model, which integrate coreference graphs at different levels of the PLM architecture. Evidence-enhanced refers to textual-level methods that include an evidence generator (generating new text that elaborates a pronoun) and an enhanced question (rewriting a pronoun in a question) as PLM input. The fusion model is a structural-level method that combines the PLM with a graph neural network. We evaluated these approaches on a pronoun-containing subset of CoQA and on the whole CoQA dataset. The results showed that our methods outperform baseline PLM methods with BERT and RoBERTa.
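The "enhanced question" idea, rewriting a pronoun in the question using edges from a coreference graph, can be illustrated with a toy graph. The simple pronoun-to-entity mapping and `enhance_question` below are invented stand-ins, not the paper's graph construction or rewriting procedure.

```python
# pronoun -> entity edges of a toy coreference graph (invented example)
coref_graph = {"he": "John", "it": "the book"}

def enhance_question(question: str, graph: dict) -> str:
    """Rewrite pronouns in the question with their referential entities,
    producing the text that would be fed to the reading-comprehension model."""
    return " ".join(graph.get(token.lower(), token)
                    for token in question.split())

print(enhance_question("What did he buy ?", coref_graph))
# What did John buy ?
```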
Collection and Analysis of Travel Agency Task Dialogues with Age-Diverse Speakers
Michimasa Inaba | Yuya Chiba | Ryuichiro Higashinaka | Kazunori Komatani | Yusuke Miyao | Takayuki Nagai
Proceedings of the Thirteenth Language Resources and Evaluation Conference
When individuals communicate with each other, they use different vocabulary, speaking speed, facial expressions, and body language depending on the people they talk to. This paper focuses on the speaker's age as a factor that affects these changes in communication. We collected a multimodal dialogue corpus covering a wide range of speaker ages. As the dialogue task, we focus on travel, which interests people of all ages, and set up a task based on a tourism consultation between an operator and a customer at a travel agency. This paper provides details of the dialogue task, the collection procedure and annotations, and an analysis of the characteristics of the dialogues and facial expressions with respect to speaker age. The results of the analysis suggest that the adult speakers have more independent opinions, the older speakers express their opinions more frequently than other age groups, and the operators smiled more frequently at the minor (underage) speakers.
2020
User Impressions of Questions to Acquire Lexical Knowledge
Kazunori Komatani | Mikio Nakano
Proceedings of the 21th Annual Meeting of the Special Interest Group on Discourse and Dialogue
For the acquisition of knowledge through dialogues, it is crucial for systems to ask questions that do not diminish the user's willingness to talk, i.e., that do not degrade the user's impression. This paper reports the results of our analysis of how user impression changes depending on the type of question used to acquire lexical knowledge, that is, explicit versus implicit questions, and on the correctness of the questions' content. We also analyzed how sequences of the same type of question affect user impression. User impression scores were collected from 104 participants recruited via crowdsourcing, and regression analysis was then conducted. The results demonstrate that implicit questions give a good impression when their content is correct, but a bad impression otherwise. We also found that consecutive explicit questions are more annoying than implicit ones when the content of the questions is correct. Our findings reveal helpful insights for creating a strategy that avoids deterioration of user impression during knowledge acquisition.
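The regression setup described can be sketched as a toy ordinary-least-squares fit of impression scores against a 0/1 content-correctness feature. The data below are fabricated, and the paper's actual model and coefficients are not reproduced here.

```python
from statistics import mean

def ols(xs, ys):
    """Ordinary least squares for one predictor: y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

correct = [1, 1, 0, 0, 1, 0]       # was the question's content correct? (fabricated)
impression = [6, 7, 2, 3, 6, 1]    # 7-point impression scores (fabricated)
slope, intercept = ols(correct, impression)
print(slope > 0)  # True: in this toy data, correct content predicts a better impression
```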
2018
Collection of Multimodal Dialog Data and Analysis of the Result of Annotation of Users’ Interest Level
Masahiro Araki | Sayaka Tomimasu | Mikio Nakano | Kazunori Komatani | Shogo Okada | Shinya Fujie | Hiroaki Sugiyama
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)
Proceedings of the 19th Annual SIGdial Meeting on Discourse and Dialogue
Kazunori Komatani | Diane Litman | Kai Yu | Alex Papangelis | Lawrence Cavedon | Mikio Nakano
2017
Unsupervised Segmentation of Phoneme Sequences based on Pitman-Yor Semi-Markov Model using Phoneme Length Context
Ryu Takeda | Kazunori Komatani
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
Unsupervised segmentation of phoneme sequences is an essential process for obtaining unknown words during spoken dialogues. In this segmentation, an input phoneme sequence without delimiters is converted into segmented sub-sequences corresponding to words. The Pitman-Yor semi-Markov model (PYSMM) is promising for this problem, but its performance degrades when it is applied to phoneme-level word segmentation because of insufficient cues for the segmentation; e.g., homophones are improperly treated as single entries, and their different contexts are confused. We propose a phoneme-length context model for PYSMM that gives a helpful cue at the phoneme level and predicts succeeding segments more accurately. Our experiments showed that the peak performance with our context model outperformed that without it by up to 0.045 in terms of F-measure.
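The semi-Markov segmentation idea, choosing the word segmentation of an undelimited phoneme string that maximizes the total segment score, can be sketched as a simple dynamic program. The toy lexicon below stands in for the PYSMM's learned segment probabilities; the proposed phoneme-length context is not modeled here.

```python
import math

def segment(phonemes: str, score, max_len: int):
    """Return the segmentation maximizing the sum of log segment scores."""
    n = len(phonemes)
    best = [(-math.inf, -1)] * (n + 1)  # (best log score, backpointer)
    best[0] = (0.0, -1)
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            cand = best[start][0] + math.log(score(phonemes[start:end]))
            if cand > best[end][0]:
                best[end] = (cand, start)
    segs, i = [], n                      # backtrack the best segmentation
    while i > 0:
        segs.append(phonemes[best[i][1]:i])
        i = best[i][1]
    return segs[::-1]

lexicon = {"ohayou": 0.5, "gozaimasu": 0.4}       # toy segment scores
score = lambda s: lexicon.get(s, 1e-4 ** len(s))  # unknown segments penalized
print(segment("ohayougozaimasu", score, max_len=9))
# ['ohayou', 'gozaimasu']
```

In the PYSMM, `score` would be the model's posterior probability of a segment given its context rather than a fixed table, but the dynamic program over segment boundaries has the same shape.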
Lexical Acquisition through Implicit Confirmations over Multiple Dialogues
Kohei Ono | Ryu Takeda | Eric Nichols | Mikio Nakano | Kazunori Komatani
Proceedings of the 18th Annual SIGdial Meeting on Discourse and Dialogue
We address the problem of acquiring the ontological categories of unknown terms through implicit confirmation in dialogues. We develop an approach that makes implicit confirmation requests with an unknown term’s predicted category. Our approach does not degrade user experience with repetitive explicit confirmations, but the system has difficulty determining if information in the confirmation request can be correctly acquired. To overcome this challenge, we propose a method for determining whether or not the predicted category is correct, which is included in an implicit confirmation request. Our method exploits multiple user responses to implicit confirmation requests containing the same ontological category. Experimental results revealed that the proposed method exhibited a higher precision rate for determining the correctly predicted categories than when only single user responses were considered.
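The multi-response idea can be sketched as aggregating the user's reactions across several implicit confirmation requests that contained the same predicted category. The acceptance scores and threshold below are invented for illustration; they are not the paper's decision method.

```python
def category_correct(acceptance_scores, threshold=0.6):
    """Accept a predicted ontological category when the mean acceptance
    score over multiple implicit confirmation requests (one per dialogue)
    reaches the threshold. Scores in [0, 1] are a toy stand-in for a
    classifier of the user's reaction to each request."""
    return sum(acceptance_scores) / len(acceptance_scores) >= threshold

print(category_correct([0.9, 0.7, 0.8]))  # True:  users consistently accepted
print(category_correct([0.2, 0.5, 0.4]))  # False: reactions suggest it is wrong
```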
2016
Bayesian Language Model based on Mixture of Segmental Contexts for Spontaneous Utterances with Unexpected Words
Ryu Takeda | Kazunori Komatani
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers
This paper describes a Bayesian language model for predicting spontaneous utterances. People sometimes say unexpected words, such as fillers or hesitations, which cause the misprediction of words in normal N-gram models. Our proposed model considers mixtures of possible segmental contexts, that is, a kind of context-word selection. It can reduce the negative effects caused by unexpected words because it represents the conditional occurrence probability of a word as a weighted mixture over possible segmental contexts. Tuning the mixture weights is the key issue in this approach because the segment patterns become numerous, so we resolve it with a Bayesian model. The generative process is achieved by combining the stick-breaking process with the process used in the variable-order Pitman-Yor language model. Experimental evaluations revealed that our model outperformed contiguous N-gram models in terms of perplexity on noisy text including hesitations.
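The mixture-of-segmental-contexts idea can be illustrated as a weighted mixture of conditional probabilities over candidate contexts, e.g., one containing a filler and one that skips it. The probability table and weights below are fabricated; the paper learns them with its Bayesian model rather than fixing them by hand.

```python
def mixture_prob(word, contexts, cond_prob, weights):
    """P(word) as a weighted mixture over candidate segmental contexts."""
    return sum(w * cond_prob.get((ctx, word), 1e-6)
               for ctx, w in zip(contexts, weights))

cond_prob = {
    (("i", "uh", "want"), "coffee"): 0.05,  # context containing the filler "uh"
    (("i", "want"), "coffee"): 0.40,        # context with the filler skipped
}
contexts = [("i", "uh", "want"), ("i", "want")]
p = mixture_prob("coffee", contexts, cond_prob, weights=[0.3, 0.7])
print(round(p, 3))  # 0.295
```

Because the filler-skipping context carries most of the weight, the unexpected word "uh" degrades the prediction of "coffee" far less than in a contiguous N-gram model.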
2015
User Adaptive Restoration for Incorrectly-Segmented Utterances in Spoken Dialogue Systems
Kazunori Komatani | Naoki Hotta | Satoshi Sato | Mikio Nakano
Proceedings of the 16th Annual Meeting of the Special Interest Group on Discourse and Dialogue
2013
Generating More Specific Questions for Acquiring Attributes of Unknown Concepts from Users
Tsugumi Otsuka | Kazunori Komatani | Satoshi Sato | Mikio Nakano
Proceedings of the SIGDIAL 2013 Conference
2011
A Two-Stage Domain Selection Framework for Extensible Multi-Domain Spoken Dialogue Systems
Mikio Nakano | Shun Sato | Kazunori Komatani | Kyoko Matsuyama | Kotaro Funakoshi | Hiroshi G. Okuno
Proceedings of the SIGDIAL 2011 Conference
2010
Online Error Detection of Barge-In Utterances by Using Individual Users’ Utterance Histories in Spoken Dialogue System
Kazunori Komatani | Hiroshi G. Okuno
Proceedings of the SIGDIAL 2010 Conference
Automatic Allocation of Training Data for Rapid Prototyping of Speech Understanding based on Multiple Model Combination
Kazunori Komatani | Masaki Katsumaru | Mikio Nakano | Kotaro Funakoshi | Tetsuya Ogata | Hiroshi G. Okuno
Coling 2010: Posters
2009
A Speech Understanding Framework that Uses Multiple Language Models and Multiple Understanding Models
Masaki Katsumaru | Mikio Nakano | Kazunori Komatani | Kotaro Funakoshi | Tetsuya Ogata | Hiroshi G. Okuno
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Companion Volume: Short Papers
Ranking Help Message Candidates Based on Robust Grammar Verification Results and Utterance History in Spoken Dialogue Systems
Kazunori Komatani | Satoshi Ikeda | Yuichiro Fukubayashi | Tetsuya Ogata | Hiroshi Okuno
Proceedings of the SIGDIAL 2009 Conference
Predicting Barge-in Utterance Errors by using Implicitly-Supervised ASR Accuracy and Barge-in Rate per User
Kazunori Komatani | Alexander I. Rudnicky
Proceedings of the ACL-IJCNLP 2009 Conference Short Papers
2008
Rapid Prototyping of Robust Language Understanding Modules for Spoken Dialogue Systems
Yuichiro Fukubayashi | Kazunori Komatani | Mikio Nakano | Kotaro Funakoshi | Hiroshi Tsujino | Tetsuya Ogata | Hiroshi G. Okuno
Proceedings of the Third International Joint Conference on Natural Language Processing: Volume-I
2007
Introducing Utterance Verification in Spoken Dialogue System to Improve Dynamic Help Generation for Novice Users
Kazunori Komatani | Yuichiro Fukubayashi | Tetsuya Ogata | Hiroshi G. Okuno
Proceedings of the 8th SIGdial Workshop on Discourse and Dialogue
2006
Multi-Domain Spoken Dialogue System with Extensibility and Robustness against Speech Recognition Errors
Kazunori Komatani | Naoyuki Kanda | Mikio Nakano | Kazuhiro Nakadai | Hiroshi Tsujino | Tetsuya Ogata | Hiroshi G. Okuno
Proceedings of the 7th SIGdial Workshop on Discourse and Dialogue
2005
Empirical Verification of Meaning-Game-based Generalization of Centering Theory with Large Japanese Corpus
Shun Shiramatsu | Kazunori Komatani | Takashi Miyata | Koichi Hashida | Hiroshi Okuno
Proceedings of the 19th Pacific Asia Conference on Language, Information and Computation
2004
Efficient Confirmation Strategy for Large-scale Text Retrieval Systems with Spoken Dialogue Interface
Kazunori Komatani | Teruhisa Misu | Tatsuya Kawahara | Hiroshi G. Okuno
COLING 2004: Proceedings of the 20th International Conference on Computational Linguistics
2003
Flexible Guidance Generation Using User Model in Spoken Dialogue Systems
Kazunori Komatani | Shinichi Ueno | Tatsuya Kawahara | Hiroshi G. Okuno
Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics
Dialog Navigator : A Spoken Dialog Q-A System based on Large Text Knowledge Base
Yoji Kiyota | Sadao Kurohashi | Teruhisa Misu | Kazunori Komatani | Tatsuya Kawahara | Fuyuko Kido
The Companion Volume to the Proceedings of 41st Annual Meeting of the Association for Computational Linguistics
Flexible Spoken Dialogue System based on User Models and Dynamic Generation of VoiceXML Scripts
Kazunori Komatani | Fumihiro Adachi | Shinichi Ueno | Tatsuya Kawahara | Hiroshi G. Okuno
Proceedings of the Fourth SIGdial Workshop of Discourse and Dialogue
2002
Efficient Dialogue Strategy to Find Users’ Intended Items from Information Query Results
Kazunori Komatani | Tatsuya Kawahara | Ryosuke Ito | Hiroshi G. Okuno
COLING 2002: The 19th International Conference on Computational Linguistics
2000
Flexible Mixed-Initiative Dialogue Management using Concept-Level Confidence Measures of Speech Recognizer Output
Kazunori Komatani | Tatsuya Kawahara
COLING 2000 Volume 1: The 18th International Conference on Computational Linguistics