Developing Dataset of Japanese Slot Filling Quizzes Designed for Evaluation of Machine Reading Comprehension
Takuto Watarai | Masatoshi Tsuchiya
Proceedings of the Twelfth Language Resources and Evaluation Conference
This paper describes a dataset, currently under development, of Japanese slot filling quizzes designed for the evaluation of machine reading comprehension. The dataset consists of quizzes automatically generated from Aozora Bunko, and each quiz is defined as a 4-tuple: a context passage, a query containing a slot, an answer character, and a set of possible answer characters. The query is generated from the original sentence, which appears immediately after the context passage in the target book, by replacing the answer character with the slot. The set of possible answer characters consists of the answer character and the other characters who appear in the context passage. Because the context passage and the query share the same context, a machine that precisely understands the context should be able to select the correct answer from the set of possible answer characters. The distinctive point of our approach is that we use the characters of the target books as slots when generating queries from original sentences, because characters play important roles in narrative texts and precisely understanding their relationships is necessary for reading comprehension. To extract characters from the target books, manually created dictionaries of characters are employed, because some characters appear as common nouns rather than as named entities.
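The 4-tuple described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `Quiz`, `make_quiz`, `SLOT`, and `context_size` are hypothetical, the English toy sentences stand in for Aozora Bunko text, and the character dictionary is matched by plain substring search. Taking the first dictionary entry found in the target sentence as the answer, and skipping quizzes with no other candidate, are likewise assumptions made only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

SLOT = "[SLOT]"  # hypothetical placeholder token marking the slot


@dataclass
class Quiz:
    context: str           # passage preceding the original sentence
    query: str             # original sentence with the answer replaced by the slot
    answer: str            # the character removed from the original sentence
    candidates: List[str]  # the answer plus other characters found in the context


def make_quiz(sentences: List[str], index: int, characters: List[str],
              context_size: int = 3) -> Optional[Quiz]:
    """Build one quiz whose query comes from sentences[index].

    `characters` plays the role of a manually created character dictionary.
    """
    if index < context_size or index >= len(sentences):
        return None  # not enough preceding text to form a context passage
    context = " ".join(sentences[index - context_size:index])
    target = sentences[index]

    # Assumption: the first dictionary entry found in the target is the answer.
    answer = next((c for c in characters if c in target), None)
    if answer is None:
        return None

    # Other characters that appear in the context passage.
    others = [c for c in characters if c != answer and c in context]
    if not others:
        return None  # assumption: a quiz needs at least one wrong candidate

    query = target.replace(answer, SLOT, 1)
    return Quiz(context, query, answer, sorted([answer] + others))


sentences = [
    "Alice met the old fisherman at the harbor.",
    "The fisherman offered Alice a boat.",
    "Bob watched them from the pier.",
    "Then Alice thanked the fisherman.",
]
# "fisherman" is a common noun, so a named-entity tagger alone might miss it.
characters = ["Alice", "Bob", "fisherman"]
quiz = make_quiz(sentences, 3, characters)
print(quiz)
```

The dictionary entry "fisherman" shows why a hand-built list is useful here: the character is referred to by a common noun, which a lookup against a dictionary finds directly.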