Seiya Kawano

2025

pdf bib
MK2 at PBIG Competition: A Prompt Generation Solution
Yuzheng Xu | Tosho Hirasawa | Seiya Kawano | Shota Kato | Tadashi Kozuno
Proceedings of the 2nd Workshop on Agent AI for Scenario Planning

pdf bib abs
Pragmatic Theories Enhance Understanding of Implied Meanings in LLMs
Takuma Sato | Seiya Kawano | Koichiro Yoshino
Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics

The ability to accurately interpret implied meanings plays a crucial role in human communication and language use, and language models are also expected to possess this capability. This study demonstrates that providing language models with pragmatic theories as prompts is an effective in-context learning approach for tasks to understand implied meanings. Specifically, we propose an approach in which an overview of pragmatic theories, such as Gricean pragmatics and Relevance Theory, is presented as a prompt to the language model, guiding it through a step-by-step reasoning process to derive a final interpretation. Experimental results showed that, compared to the baseline, which prompts intermediate reasoning without presenting pragmatic theories (0-shot Chain-of-Thought), our methods enabled language models to achieve up to 9.6% higher scores on pragmatic reasoning tasks. Furthermore, we show that even without explaining the details of pragmatic theories, merely mentioning their names in the prompt leads to a certain performance improvement (around 1-3%) in larger models compared to the baseline.

pdf bib abs
Multi-step or Direct: A Proactive Home-Assistant System Based on Commonsense Reasoning
Konosuke Yamasaki | Shohei Tanaka | Akishige Yuguchi | Seiya Kawano | Koichiro Yoshino
Proceedings of the 26th Annual Meeting of the Special Interest Group on Discourse and Dialogue

There is a growing expectation for the realization of proactive home-assistant robots that can assist users in their daily lives. It is essential to develop a framework that closely observes the user’s surrounding context, selectively extracts relevant information, and infers the user’s needs to proactively propose appropriate assistance. In this study, we first extend the Do-I-Demand dataset to define expected proactive assistance actions in domestic situations, where users make ambiguous utterances. These behaviors were defined based on common patterns of support that a majority of users would expect from a robot. We subsequently constructed a framework that infers users’ expected assistance actions from ambiguous utterances through commonsense reasoning. We explored two approaches: (1) multi-step reasoning using COMET as a commonsense reasoning engine, and (2) direct reasoning using large language models. Our experimental results suggest that both the multi-step and direct reasoning methods can successfully derive necessary assistance actions even when dealing with ambiguous user utterances.

2024

pdf bib abs
A Gaze-grounded Visual Question Answering Dataset for Clarifying Ambiguous Japanese Questions
Shun Inadumi | Seiya Kawano | Akishige Yuguchi | Yasutomo Kawanishi | Koichiro Yoshino
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Situated conversations, which refer to visual information as visual question answering (VQA), often contain ambiguities caused by reliance on directive information. This problem is exacerbated because some languages, such as Japanese, often omit subjective or objective terms. Such ambiguities in questions are often clarified by the contexts in conversational situations, such as joint attention with a user or user gaze information. In this study, we propose the Gaze-grounded VQA dataset (GazeVQA) that clarifies ambiguous questions using gaze information by focusing on a clarification process complemented by gaze information. We also propose a method that utilizes gaze target estimation results to improve the accuracy of GazeVQA tasks. Our experimental results showed that the proposed method improved the performance in some cases of a VQA system on GazeVQA and identified some typical problems of GazeVQA tasks that need to be improved.

pdf bib abs
J-CRe3: A Japanese Conversation Dataset for Real-world Reference Resolution
Nobuhiro Ueda | Hideko Habe | Akishige Yuguchi | Seiya Kawano | Yasutomo Kawanishi | Sadao Kurohashi | Koichiro Yoshino
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Understanding expressions that refer to the physical world is crucial for such human-assisting systems in the real world, as robots that must perform actions that are expected by users. In real-world reference resolution, a system must ground the verbal information that appears in user interactions to the visual information observed in egocentric views. To this end, we propose a multimodal reference resolution task and construct a Japanese Conversation dataset for Real-world Reference Resolution (J-CRe3). Our dataset contains egocentric video and dialogue audio of real-world conversations between two people acting as a master and an assistant robot at home. The dataset is annotated with crossmodal tags between phrases in the utterances and the object bounding boxes in the video frames. These tags include indirect reference relations, such as predicate-argument structures and bridging references as well as direct reference relations. We also constructed an experimental model and clarified the challenges in multimodal reference resolution tasks.

2023

pdf bib abs
Analysis of Style-Shifting on Social Media: Using Neural Language Model Conditioned by Social Meanings
Seiya Kawano | Shota Kanezaki | Angel Fernando Garcia Contreras | Akishige Yuguchi | Marie Katsurai | Koichiro Yoshino
Findings of the Association for Computational Linguistics: EMNLP 2023

In this paper, we propose a novel framework for evaluating style-shifting in social media conversations. Our proposed framework captures changes in an individual’s conversational style based on surprisals predicted by a personalized neural language model for individuals. Our personalized language model integrates not only the linguistic contents of conversations but also non-linguistic factors, such as social meanings, including group membership, personal attributes, and individual beliefs. We incorporate these factors directly or implicitly into our model, leveraging large, pre-trained language models and feature vectors derived from a relationship graph on social media. Compared to existing models, our personalized language model demonstrated superior performance in predicting an individual’s language in a test set. Furthermore, an analysis of style-shifting utilizing our proposed metric based on our personalized neural language model reveals a correlation between our metric and various conversation factors as well as human evaluation of style-shifting.

2022

pdf bib abs
Pseudo Ambiguous and Clarifying Questions Based on Sentence Structures Toward Clarifying Question Answering System
Yuya Nakano | Seiya Kawano | Koichiro Yoshino | Katsuhito Sudoh | Satoshi Nakamura
Proceedings of the Second DialDoc Workshop on Document-grounded Dialogue and Conversational Question Answering

Question answering (QA) with disambiguation questions is essential for practical QA systems because user questions often do not contain information enough to find their answers. We call this task clarifying question answering, a task to find answers to ambiguous user questions by disambiguating their intents through interactions. There are two major problems in building a clarifying question answering system: data preparation of possible ambiguous questions and the generation of clarifying questions. In this paper, we tackle these problems by sentence generation methods using sentence structures. Ambiguous questions are generated by eliminating a part of a sentence considering the sentence structure. Clarifying the question generation method based on case frame dictionary and sentence structure is also proposed. Our experimental results verify that our pseudo ambiguous question generation successfully adds ambiguity to questions. Moreover, the proposed clarifying question generation recovers the performance drop by asking the user for missing information.

2019

pdf bib abs
Neural Conversation Model Controllable by Given Dialogue Act Based on Adversarial Learning and Label-aware Objective
Seiya Kawano | Koichiro Yoshino | Satoshi Nakamura
Proceedings of the 12th International Conference on Natural Language Generation

Building a controllable neural conversation model (NCM) is an important task. In this paper, we focus on controlling the responses of NCMs by using dialogue act labels of responses as conditions. We introduce an adversarial learning framework for the task of generating conditional responses with a new objective to a discriminator, which explicitly distinguishes sentences by using labels. This change strongly encourages the generation of label-conditioned sentences. We compared the proposed method with some existing methods for generating conditional responses. The experimental results show that our proposed method has higher controllability for dialogue acts even though it has higher or comparable naturalness to existing methods.

Co-authors

Venues

ws1