2020
Unsupervised Domain Adaptation of Language Models for Reading Comprehension
Kosuke Nishida | Kyosuke Nishida | Itsumi Saito | Hisako Asano | Junji Tomita
Proceedings of the Twelfth Language Resources and Evaluation Conference
This study tackles unsupervised domain adaptation of reading comprehension (UDARC). Reading comprehension (RC) is a task to learn the capability for question answering with textual sources. State-of-the-art models on RC still do not have general linguistic intelligence; i.e., their accuracy worsens for out-domain datasets that are not used in the training. We hypothesize that this discrepancy is caused by a lack of the language modeling (LM) capability for the out-domain. The UDARC task allows models to use supervised RC training data in the source domain and only unlabeled passages in the target domain. To solve the UDARC problem, we provide two domain adaptation models. The first one learns the out-domain LM and in-domain RC task sequentially. The second one is the proposed model that uses a multi-task learning approach of LM and RC. The models can retain both the RC capability acquired from the supervised data in the source domain and the LM capability from the unlabeled data in the target domain. We evaluated the models on UDARC with five datasets in different domains. The models outperformed the model without domain adaptation. In particular, the proposed model yielded an improvement of 4.3/4.2 points in EM/F1 in an unseen biomedical domain.
2019
Multi-style Generative Reading Comprehension
Kyosuke Nishida | Itsumi Saito | Kosuke Nishida | Kazutoshi Shinoda | Atsushi Otsuka | Hisako Asano | Junji Tomita
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
This study tackles generative reading comprehension (RC), which consists of answering questions based on textual evidence and natural language generation (NLG). We propose a multi-style abstractive summarization model for question answering, called Masque. The proposed model has two key characteristics. First, unlike most studies on RC that have focused on extracting an answer span from the provided passages, our model instead focuses on generating a summary from the question and multiple passages. This serves to cover various answer styles required for real-world applications. Second, whereas previous studies built a specific model for each answer style because of the difficulty of acquiring one general model, our approach learns multi-style answers within a model to improve the NLG capability for all styles involved. This also enables our model to give an answer in the target style. Experiments show that our model achieves state-of-the-art performance on the Q&A task and the Q&A + NLG task of MS MARCO 2.1 and the summary task of NarrativeQA. We observe that the transfer of the style-independent NLG capability to the target style is the key to its success.
Answering while Summarizing: Multi-task Learning for Multi-hop QA with Evidence Extraction
Kosuke Nishida | Kyosuke Nishida | Masaaki Nagata | Atsushi Otsuka | Itsumi Saito | Hisako Asano | Junji Tomita
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
Question answering (QA) using textual sources for purposes such as reading comprehension (RC) has attracted much attention. This study focuses on the task of explainable multi-hop QA, which requires the system to return the answer with evidence sentences by reasoning and gathering disjoint pieces of the reference texts. It proposes the Query Focused Extractor (QFE) model for evidence extraction and uses multi-task learning with the QA model. QFE is inspired by extractive summarization models; compared with the existing method, which extracts each evidence sentence independently, it sequentially extracts evidence sentences by using an RNN with an attention mechanism on the question sentence. It enables QFE to consider the dependency among the evidence sentences and cover important information in the question sentence. Experimental results show that QFE with a simple RC baseline model achieves a state-of-the-art evidence extraction score on HotpotQA. Although designed for RC, it also achieves a state-of-the-art evidence extraction score on FEVER, which is a recognizing textual entailment task on a large textual database.
A Simple but Effective Method to Incorporate Multi-turn Context with BERT for Conversational Machine Comprehension
Yasuhito Ohsugi | Itsumi Saito | Kyosuke Nishida | Hisako Asano | Junji Tomita
Proceedings of the First Workshop on NLP for Conversational AI
Conversational machine comprehension (CMC) requires understanding the context of multi-turn dialogue. Using BERT, a pre-trained language model, has been successful for single-turn machine comprehension, while modeling multiple turns of question answering with BERT has not been established because BERT has a limit on the number and the length of input sequences. In this paper, we propose a simple but effective method with BERT for CMC. Our method uses BERT to encode a paragraph independently conditioned with each question and each answer in a multi-turn context. Then, the method predicts an answer on the basis of the paragraph representations encoded with BERT. The experiments with representative CMC datasets, QuAC and CoQA, show that our method outperformed recently published methods (+0.8 F1 on QuAC and +2.1 F1 on CoQA). In addition, we conducted a detailed analysis of the effects of the number and types of dialogue history on the accuracy of CMC, and we found that the gold answer history, which may not be given in an actual conversation, contributed to the model performance most on both datasets.
2018
Commonsense Knowledge Base Completion and Generation
Itsumi Saito | Kyosuke Nishida | Hisako Asano | Junji Tomita
Proceedings of the 22nd Conference on Computational Natural Language Learning
This study focuses on the acquisition of commonsense knowledge. A previous study proposed a commonsense knowledge base completion (CKB completion) method that predicts a confidence score for triplet-style knowledge for improving the coverage of CKBs. To improve the accuracy of CKB completion and expand the size of CKBs, we formulate a new commonsense knowledge base generation task (CKB generation) and propose a joint learning method that incorporates both CKB completion and CKB generation. Experimental results show that the joint learning method improved completion accuracy and the generation model created reasonable knowledge. Our generation model could also be used to augment data and improve the accuracy of completion.
Natural Language Inference with Definition Embedding Considering Context On the Fly
Kosuke Nishida | Kyosuke Nishida | Hisako Asano | Junji Tomita
Proceedings of the Third Workshop on Representation Learning for NLP
Natural language inference (NLI) is one of the most important tasks in NLP. In this study, we propose a novel method using word dictionaries, which are pairs of a word and its definition, as external knowledge. Our neural definition embedding mechanism encodes input sentences with the definitions of each word of the sentences on the fly. It can encode the definition of words considering the context of input sentences by using an attention mechanism. We evaluated our method using WordNet as a dictionary and confirmed that our method performed better than baseline models when using the full or a subset of 100d GloVe as word embeddings.
2016
Name Translation based on Fine-grained Named Entity Recognition in a Single Language
Kugatsu Sadamitsu | Itsumi Saito | Taichi Katayama | Hisako Asano | Yoshihiro Matsuo
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
We propose named entity abstraction methods with fine-grained named entity labels for improving statistical machine translation (SMT). The methods are based on a bilingual named entity recognizer that uses a monolingual named entity recognizer with transliteration. Through experiments, we demonstrate that incorporating fine-grained named entities into statistical machine translation improves the accuracy of SMT with more adequate granularity compared with the standard SMT, which is a non-named entity abstraction method.
A Hierarchical Neural Network for Information Extraction of Product Attribute and Condition Sentences
Yukinori Homma | Kugatsu Sadamitsu | Kyosuke Nishida | Ryuichiro Higashinaka | Hisako Asano | Yoshihiro Matsuo
Proceedings of the Open Knowledge Base and Question Answering Workshop (OKBQA 2016)
This paper describes a hierarchical neural network we propose for sentence classification to extract product information from product documents. The network classifies each sentence in a document into attribute and condition classes on the basis of word sequences and sentence sequences in the document. Experimental results showed that the method using the proposed network significantly outperformed baseline methods by taking semantic representation of word and sentence sequential data into account. We also evaluated the network with two different product domains (insurance and tourism domains) and found that it was effective for both domains.
2014
Morphological Analysis for Japanese Noisy Text based on Character-level and Word-level Normalization
Itsumi Saito | Kugatsu Sadamitsu | Hisako Asano | Yoshihiro Matsuo
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers
Constructing a Corpus of Japanese Predicate Phrases for Synonym/Antonym Relations
Tomoko Izumi | Tomohide Shibata | Hisako Asano | Yoshihiro Matsuo | Sadao Kurohashi
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
We construct a large corpus of Japanese predicate phrases for synonym-antonym relations. The corpus consists of 7,278 pairs of predicates such as receive-permission (ACC) vs. obtain-permission (ACC), in which each predicate pair is accompanied by a noun phrase and case information. The relations are categorized as synonyms, entailment, antonyms, or unrelated. Antonyms are further categorized into three different classes depending on their aspect of oppositeness. Using the data as a training corpus, we conduct the supervised binary classification of synonymous predicates based on linguistically-motivated features. Combining features that are characteristic of synonymous predicates with those that are characteristic of antonymous predicates, we succeed in automatically identifying synonymous predicates at the high F-score of 0.92, a 0.4 improvement over the baseline method of using the Japanese WordNet. The results of an experiment confirm that the quality of the corpus is high enough to achieve automatic classification. To the best of our knowledge, this is the first and the largest publicly available corpus of Japanese predicate phrases for synonym-antonym relations.
2010
Recognizing Relation Expression between Named Entities based on Inherent and Context-dependent Features of Relational words
Toru Hirano | Hisako Asano | Yoshihiro Matsuo | Genichiro Kikui
Coling 2010: Posters