Yoshinobu Kano


2024

Proceedings of the 2nd International AIWolfDial Workshop
Yoshinobu Kano
Proceedings of the 2nd International AIWolfDial Workshop

AIWolfDial 2024: Summary of Natural Language Division of 6th International AIWolf Contest
Yoshinobu Kano | Yuto Sahashi | Neo Watanabe | Kaito Kagaminuma | Claus Aranha | Daisuke Katagami | Kei Harada | Michimasa Inaba | Takeshi Ito | Hirotaka Osawa | Takashi Otsuki | Fujio Toriumi
Proceedings of the 2nd International AIWolfDial Workshop

We held our 6th annual AIWolf international contest, in which agents automatically play the Werewolf game (“Mafia”) and players try to find liars through conversation. The contest aims to promote the development of agents capable of more natural, higher-level conversation involving longer contexts, personal relationships, semantics, pragmatics, and logic, and to reveal the capabilities and limits of generative AI. In the Natural Language Division of the contest, eight Japanese-speaking agent teams and five English-speaking agents played games against one another. Using the game logs, we performed human subjective evaluations, win-rate comparisons, and detailed log analysis. We found that overall system performance has improved substantially over the previous year, owing to recent advances in LLMs. Several new ideas improved the way LLMs are used, such as summarization, characterization, and logic implemented outside the LLMs. However, the systems are still far from perfect; the generated talks are sometimes inconsistent with the game actions. Our future work includes revealing whether LLMs can realize the duality of the “liar”, in other words, hold both a “true” and a “false” state of the agent at the same time, and even hold how these states look to other agents.

Text Generation Indistinguishable from Target Person by Prompting Few Examples Using LLM
Yuka Tsubota | Yoshinobu Kano
Proceedings of the 2nd International AIWolfDial Workshop

To achieve smooth and natural communication between a dialogue system and a human, the dialogue system needs to behave in a more human-like way. Recreating the personality of an actual person can be an effective means to this end. This study proposes a method to recreate a personality with a large language model (generative AI) without training, using prompting techniques to keep the creation cost as low as possible. Collecting a large amount of dialogue data from a specific person is not easy, and training on it requires a significant amount of time. Therefore, we aim to recreate the personality of a specific individual without using dialogue data. The personality referred to in this paper denotes the image of a person that can be determined solely from the input and output of text dialogues. Our experiments revealed that prompts combining profile information, responses to a few questions, and speaking characteristics extracted from those responses can improve the reproducibility of a specific individual’s personality.

Werewolf Game Agent by Generative AI Incorporating Logical Information Between Players
Neo Watanabe | Yoshinobu Kano
Proceedings of the 2nd International AIWolfDial Workshop

In recent years, AI models based on GPT have advanced rapidly. These models are capable of generating text, translating between different languages, and answering questions with high accuracy. However, the process behind their outputs remains a black box, making it difficult to ascertain the data influencing their responses. These AI models do not always produce accurate outputs and are known for generating incorrect information, known as hallucinations, whose causes are hard to pinpoint. Moreover, they still face challenges in solving complex problems that require step-by-step reasoning, despite various improvements like the Chain-of-Thought approach. There’s no guarantee that these models can independently perform logical reasoning from scratch, raising doubts about the reliability and accuracy of their inferences. To address these concerns, this study proposes the incorporation of an explicit logical structure into the AI’s text generation process. As a validation experiment, a text-based agent capable of playing the Werewolf game, which requires deductive reasoning, was developed using GPT-4. By comparing the model combined with an external explicit logical structure and a baseline that lacks such a structure, the proposed method demonstrated superior reasoning capabilities in subjective evaluations, suggesting the effectiveness of adding an explicit logical framework to the conventional AI models.

2023

AIWolfDial 2023: Summary of Natural Language Division of 5th International AIWolf Contest
Yoshinobu Kano | Neo Watanabe | Kaito Kagaminuma | Claus Aranha | Jaewon Lee | Benedek Hauer | Hisaichi Shibata | Soichiro Miki | Yuta Nakamura | Takuya Okubo | Soga Shigemura | Rei Ito | Kazuki Takashima | Tomoki Fukuda | Masahiro Wakutani | Tomoya Hatanaka | Mami Uchida | Mikio Abe | Akihiro Mikami | Takashi Otsuki | Zhiyang Qi | Kei Harada | Michimasa Inaba | Daisuke Katagami | Hirotaka Osawa | Fujio Toriumi
Proceedings of the 16th International Natural Language Generation Conference: Generation Challenges

We held our 5th annual AIWolf international contest, in which agents automatically play the Werewolf game (“Mafia”) and players try to find liars through conversation. The contest aims to promote the development of agents capable of more natural, higher-level conversation involving longer contexts, personal relationships, semantics, pragmatics, and logic, and to reveal the capabilities and limits of generative AI. In the Natural Language Division of the contest, six Japanese-speaking agents from five teams and three English-speaking agents played games against one another. Using the game logs, we performed human subjective evaluations and detailed log analysis. We found that overall system performance has improved substantially over the previous year, owing to recent advances in LLMs. However, the systems are still far from perfect: the generated talks are sometimes inconsistent with the game actions, and it remains doubtful whether the agents infer roles by logic rather than by superficial utterance generation. Although not explicitly observed in these logs, it would still be difficult to make an agent tell a lie, that is, pretend to be a villager while holding the opposite goal internally. Our future work includes revealing whether LLMs can realize the duality of the “liar”, in other words, hold both a “true” and a “false” state of the agent at the same time, and even hold how these states look to other agents.

Dialogue Response Generation Using Completion of Omitted Predicate Arguments Based on Zero Anaphora Resolution
Ayaka Ueyama | Yoshinobu Kano
Proceedings of the 24th Annual Meeting of the Special Interest Group on Discourse and Dialogue

Human conversation attempts to build common ground consisting of shared beliefs, knowledge, and perceptions that form the premise for understanding utterances. Recent deep learning-based dialogue systems use human dialogue data to train a mapping from a dialogue history to responses, but common ground not directly expressed in words makes it difficult to generate coherent responses by learning statistical patterns alone. We propose Dialogue Completion using Zero Anaphora Resolution (DCZAR), a framework that explicitly completes omitted information in the dialogue history and generates responses from the completed dialogue history. In this study, we conducted automatic and human evaluations by applying several pretraining methods and datasets in Japanese in various combinations. Experimental results show that the DCZAR framework contributes to the generation of more coherent and engaging responses.

2021

ZmBART: An Unsupervised Cross-lingual Transfer Framework for Language Generation
Kaushal Kumar Maurya | Maunendra Sankar Desarkar | Yoshinobu Kano | Kumari Deepshikha
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

2020

Diverse dialogue generation with context dependent dynamic loss function
Ayaka Ueyama | Yoshinobu Kano
Proceedings of the 28th International Conference on Computational Linguistics

Dialogue systems using deep learning can now generate fluent response sentences to user utterances. Nevertheless, they tend to produce responses that are not diverse and that are less context-dependent. To address these shortcomings, we propose a new loss function, an Inverse N-gram loss (INF), which incorporates contextual fluency and diversity at the same time through a simple formula. Our INF loss adjusts its value dynamically by applying a weight based on the inverse frequency of the tokens’ n-grams to the Softmax Cross-Entropy loss, so that rare tokens become more likely to appear while the fluency of the generated sentences is retained. We trained a Transformer on English and Japanese Twitter replies as single-turn dialogues using different loss functions. Our INF loss model outperformed the baseline SCE loss and ITF loss models in automatic evaluations such as DIST-N and ROUGE, and also achieved higher scores in our human evaluations of coherence and richness.
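
The idea of weighting the cross-entropy loss by inverse n-gram frequency can be illustrated with a minimal Python sketch. It is not the paper's formula, which the abstract does not give: the function names, the bigram default, the fallback weight of 1.0 for unseen n-grams, and the averaging over tokens are all assumptions made for illustration.

```python
import math
from collections import Counter


def ngram_counts(corpus, n=2):
    """Count every n-gram in a tokenized corpus (a list of token lists)."""
    counts = Counter()
    for sent in corpus:
        for i in range(len(sent) - n + 1):
            counts[tuple(sent[i:i + n])] += 1
    return counts


def inverse_ngram_weights(target, counts, n=2):
    """Weight each target token by 1 / count of the n-gram ending at it.

    Assumption: n-grams that are unseen, or too short because they sit at
    the start of the sentence, fall back to a weight of 1.0.
    """
    weights = []
    for i in range(len(target)):
        gram = tuple(target[max(0, i - n + 1): i + 1])
        weights.append(1.0 / max(counts.get(gram, 0), 1))
    return weights


def weighted_cross_entropy(probs, weights):
    """Average of -w_t * log p_t, where p_t is the model probability
    assigned to the reference token at position t."""
    return -sum(w * math.log(p) for p, w in zip(probs, weights)) / len(probs)


corpus = [["i", "am", "fine"], ["i", "am", "happy"]]
counts = ngram_counts(corpus)
weights = inverse_ngram_weights(["i", "am", "happy"], counts)
loss = weighted_cross_entropy([0.5, 0.5, 0.5], weights)
```

In this toy corpus the frequent bigram ("i", "am") receives a weight of 0.5, so errors on it contribute less to the loss than errors on the rarer bigram ("am", "happy").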

2019

Proceedings of the 1st International Workshop of AI Werewolf and Dialog System (AIWolfDial2019)
Yoshinobu Kano | Claus Aranha | Michimasa Inaba | Fujio Toriumi | Hirotaka Osawa | Daisuke Katagami | Takashi Otsuki
Proceedings of the 1st International Workshop of AI Werewolf and Dialog System (AIWolfDial2019)

Overview of AIWolfDial 2019 Shared Task: Contest of Automatic Dialog Agents to Play the Werewolf Game through Conversations
Yoshinobu Kano | Claus Aranha | Michimasa Inaba | Fujio Toriumi | Hirotaka Osawa | Daisuke Katagami | Takashi Otsuki | Issei Tsunoda | Shoji Nagayama | Dolça Tellols | Yu Sugawara | Yohei Nakata
Proceedings of the 1st International Workshop of AI Werewolf and Dialog System (AIWolfDial2019)

AI Werewolf Agent with Reasoning Using Role Patterns and Heuristics
Issei Tsunoda | Yoshinobu Kano
Proceedings of the 1st International Workshop of AI Werewolf and Dialog System (AIWolfDial2019)

Multiple News Headlines Generation using Page Metadata
Kango Iwama | Yoshinobu Kano
Proceedings of the 12th International Conference on Natural Language Generation

Multiple headlines of a newspaper article play an important role in expressing the content of the article accurately and concisely. A headline depends on the content and intent of its article. While a single headline expresses the whole corresponding article, each of multiple headlines expresses different information individually. We propose a method for automatically generating such diverse multiple headlines for a newspaper. Our generation method is based on the Pointer-Generator Network and uses newspaper page metadata, which can change headline generation behavior. This page metadata includes headline location, headline size, article page number, etc. In previous related work, an ensemble of three different generation models was used to obtain a single headline, where each generation model generates a single headline candidate. In contrast, we use a single model to generate multiple headlines. We conducted automatic evaluations of the generated headlines. The results show that our method improved the ROUGE-1 score by 4.32 points over the baseline. These results suggest that our model using page metadata can generate varied multiple headlines for an article with better performance.

2018

De-identifying Free Text of Japanese Dummy Electronic Health Records
Kohei Kajiyama | Hiromasa Horiguchi | Takashi Okumura | Mizuki Morita | Yoshinobu Kano
Proceedings of the Ninth International Workshop on Health Text Mining and Information Analysis

A new law was established in Japan to promote the utilization of EHRs for research and development, while de-identification is required to use EHRs. However, studies of automatic de-identification in the healthcare domain are not active for the Japanese language, and as far as we know, no de-identification tool with practical performance is available for Japanese medical domains. Previous work shows that rule-based methods are still effective, while deep learning methods have recently been reported to perform better. In order to implement and evaluate a de-identification tool at a practical level, we implemented three methods: rule-based, CRF, and LSTM. We prepared three datasets of pseudo EHRs with manually annotated de-identification tags. These datasets are derived from shared task data, to allow comparison with previous work, and from our new data, to increase the training data. Our results show that our LSTM-based method is better and more robust, which leads to our future work of applying our system to actual de-identification tasks in hospitals.

Japanese Advertising Slogan Generator using Case Frame and Word Vector
Kango Iwama | Yoshinobu Kano
Proceedings of the 11th International Conference on Natural Language Generation

Many works have been published on automatic sentence generation in a variety of domains. However, no single method is available at present that can generate sentences for all domains; each domain requires a suitable generation method. In this paper, we focus on automatic generation of Japanese advertisement slogans. We use our advertisement slogan database, case frame information, and word vector information. We entered our system in a copy competition for human copywriters, where our advertisement slogan was selected as a finalist. Our system could be regarded as the world's first system that generates slogans at a practical level, as an advertising agency already employs our system in its business.

2016

MedNLPDoc: Japanese Shared Task for Clinical NLP
Eiji Aramaki | Yoshinobu Kano | Tomoko Ohkuma | Mizuki Morita
Proceedings of the Clinical Natural Language Processing Workshop (ClinicalNLP)

Due to the recent replacement of physical documents with electronic medical records (EMR), the importance of information processing in medical fields has increased. We have been organizing the MedNLP task series in NTCIR-10 and 11. These workshops were the first shared tasks that attempted to evaluate technologies for retrieving important information from medical reports written in Japanese. In this report, we describe the NTCIR-12 MedNLPDoc task, which is designed for more advanced and practical use in the medical fields. This task is framed as a multi-labeling task on a patient record. This report presents the results of the shared task and discusses and illustrates the remaining issues in the medical natural language processing field.

Inference of ICD Codes from Japanese Medical Records by Searching Disease Names
Masahito Sakishita | Yoshinobu Kano
Proceedings of the Clinical Natural Language Processing Workshop (ClinicalNLP)

The importance of utilizing medical information is increasing as electronic health records (EHRs) become widely used. We aim to assign international standardized disease codes, ICD-10, to Japanese textual information in EHRs so that users can reuse the information accurately. In this paper, we propose methods to automatically extract diagnoses and to assign ICD codes to Japanese medical records. Due to the lack of available training data, we deliberately employed rule-based methods rather than machine learning. We carefully observed the characteristics of medical records and wrote rules by hand to build effective methods. We applied our system to the NTCIR-12 MedNLPDoc shared task data, where participants are required to assign ICD-10 codes of possible diagnoses in given EHRs. In this shared task, our system achieved the highest F-measure score among all participants under the strictest evaluation criteria. Through comparison with other approaches, we show that our approach could be a useful milestone for the future development of Japanese medical record processing.

Answering Yes-No Questions by Penalty Scoring in History Subjects of University Entrance Examinations
Yoshinobu Kano
Proceedings of the Open Knowledge Base and Question Answering Workshop (OKBQA 2016)

Answering yes–no questions is more difficult than simply retrieving ranked search results. To answer yes–no questions, especially when the correct answer is no, one must find an objectionable keyword that makes the question’s answer no. Existing systems, such as factoid-based ones, cannot answer yes–no questions very well because they handle such objectionable keywords insufficiently. We suggest an algorithm that answers yes–no questions by assigning an importance to objectionable keywords. Concretely, we suggest a penalized scoring method that finds parts of documents that include such objectionable keywords and lowers their scores. We check the keyword distribution for each part of a document, such as a paragraph, calculating the keyword density as a basic score. Then we apply an objectionable keyword penalty when a keyword does not appear in a target part but appears in other parts of the document. Our algorithm is robust for open domain problems because it requires no training. We achieved F1 scores 4.45 points better than the best score of the NTCIR-10 RITE2 shared task, and also obtained the best score in the 2014 mock university examination challenge of the Todai Robot project.
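
The penalized scoring outlined in this abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function name `penalized_scores`, the whitespace tokenization, and the `penalty` weight are hypothetical choices made for illustration. Each part of a document receives its keyword density as a basic score, and a penalty is subtracted for every query keyword that is missing from that part but present in another part.

```python
def penalized_scores(parts, keywords, penalty=0.5):
    """Score each part of a document against the query keywords.

    Basic score: fraction of the part's tokens that are query keywords.
    Penalty (hypothetical weight): subtracted once per keyword that is
    absent from this part but present in some other part.
    """
    keywords = {k.lower() for k in keywords}
    token_lists = [p.lower().split() for p in parts]
    token_sets = [set(tokens) for tokens in token_lists]
    scores = []
    for i, tokens in enumerate(token_lists):
        if not tokens:
            scores.append(0.0)
            continue
        density = sum(1 for t in tokens if t in keywords) / len(tokens)
        # Vocabulary of all the other parts of the same document.
        elsewhere = set().union(
            *(s for j, s in enumerate(token_sets) if j != i)
        )
        missing = {
            k for k in keywords
            if k not in token_sets[i] and k in elsewhere
        }
        scores.append(density - penalty * len(missing))
    return scores


parts = ["nobunaga built azuchi castle", "hideyoshi built osaka castle"]
consistent = penalized_scores(parts, ["nobunaga", "azuchi"])
conflicting = penalized_scores(parts, ["hideyoshi", "azuchi", "castle"])
```

With the consistent query, the first part keeps its density score while the second is penalized. With the conflicting query, every part is missing a keyword that appears elsewhere, so no part scores above zero, which is the kind of signal that would point toward a "no" answer.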

Between Platform and APIs: Kachako API for Developers
Yoshinobu Kano
Proceedings of the Third International Workshop on Worldwide Language Service Infrastructure and Second Workshop on Open Infrastructures and Analysis Frameworks for Human Language Technologies (WLSI/OIAF4HLT2016)

Different types of users require different functions in NLP software. It is difficult for a single platform to cover all types of users. When a framework aims to provide more interoperability, users are required to learn more concepts, and users’ application designs are restricted to be compliant with the framework. While an interoperability framework is useful in certain cases, some types of users will not select the framework due to the learning cost and design restrictions. We suggest a rather simple interoperability framework aimed at developers. Reusing an existing NLP platform, Kachako, we created an API-oriented NLP system. This system loosely couples rich high-end functions, including annotation visualizations, statistical evaluations, annotation searching, etc. This API does not impose much learning cost on users, providing customization ability for power users while also allowing casual users to employ many GUI functions.

2013

Modeling Comma Placement in Chinese Text for Better Readability using Linguistic Features and Gaze Information
Tadayoshi Hara | Chen Chen | Yoshinobu Kano | Akiko Aizawa
Proceedings of the Second Workshop on Predicting and Improving Text Readability for Target Reader Populations

2012

Towards automation in using multi-modal language resources: compatibility and interoperability for multi-modal features in Kachako
Yoshinobu Kano
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

Use of language resources including annotated corpora and tools is not easy for users, as it requires expert knowledge to determine which resources are compatible and interoperable. Sometimes it requires programming skill in addition to the expert knowledge to make the resources compatible and interoperable when the resources are not created so. If a platform system could provide automation features for using language resources, users do not have to waste their time as the above issues are not necessarily essential for the users' goals. While our system, Kachako, provides such automation features for single-modal resources, multi-modal resources are more difficult to combine automatically. In this paper, we discuss designs of multi-modal resource compatibility and interoperability from such an automation point of view in order for the Kachako system to provide automation features of multi-modal resources. Our discussion is based on the UIMA framework, and focuses on resource metadata description optimized for ideal automation features while harmonizing with the UIMA framework using other standards as well.

Predicting Word Fixations in Text with a CRF Model for Capturing General Reading Strategies among Readers
Tadayoshi Hara | Daichi Mochihashi | Yoshinobu Kano | Akiko Aizawa
Proceedings of the First Workshop on Eye-tracking and Natural Language Processing

2011

Promoting Interoperability of Resources in META-SHARE
Paul Thompson | Yoshinobu Kano | John McNaught | Steve Pettifer | Teresa Attwood | John Keane | Sophia Ananiadou
Proceedings of the Workshop on Language Resources, Technology and Services in the Sharing Paradigm

2010

U-Compare: An Integrated Language Resource Evaluation Platform Including a Comprehensive UIMA Resource Library
Yoshinobu Kano | Ruben Dorado | Luke McCrohon | Sophia Ananiadou | Jun’ichi Tsujii
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

Language resources, including corpora and tools, normally need to be combined in order to achieve a user’s specific task. However, resources tend to be developed independently in different, incompatible formats. In this paper we describe U-Compare, which consists of the U-Compare component repository and the U-Compare platform. We have been building a highly interoperable resource library, providing the world's largest ready-to-use UIMA component repository, including a wide variety of corpus readers and state-of-the-art language tools. These resources can be deployed as local services or web services, and can even be hosted on clustered machines to increase performance, while users do not need to be aware of such differences. In addition to the resource library, an integrated language processing platform is provided, allowing workflow creation, comparison, evaluation and visualization, using the resources in the library or any UIMA component, without any programming via graphical user interfaces, while a command line launcher is also available without GUIs. Because the evaluation itself is processed in a UIMA component, users can create and plug in their own evaluation metrics in addition to the predefined metrics. U-Compare has been successfully used in many projects including BioCreative, CoNLL and the BioNLP shared task.

2009

Overview of BioNLP’09 Shared Task on Event Extraction
Jin-Dong Kim | Tomoko Ohta | Sampo Pyysalo | Yoshinobu Kano | Jun’ichi Tsujii
Proceedings of the BioNLP 2009 Workshop Companion Volume for Shared Task

Integrated NLP Evaluation System for Pluggable Evaluation Metrics with Extensive Interoperable Toolkit
Yoshinobu Kano | Luke McCrohon | Sophia Ananiadou | Jun’ichi Tsujii
Proceedings of the Workshop on Software Engineering, Testing, and Quality Assurance for Natural Language Processing (SETQA-NLP 2009)

2008

Towards Data and Goal Oriented Analysis: Tool Inter-operability and Combinatorial Comparison
Yoshinobu Kano | Ngan Nguyen | Rune Sætre | Kazuhiro Yoshida | Keiichiro Fukamachi | Yusuke Miyao | Yoshimasa Tsuruoka | Sophia Ananiadou | Jun’ichi Tsujii
Proceedings of the Third International Joint Conference on Natural Language Processing: Volume-II