Kiyoaki Aikawa


2012

We describe the evaluation framework for spoken document retrieval for the IR for the Spoken Documents Task, conducted in the ninth NTCIR Workshop. The two parts of this task were a spoken term detection (STD) subtask and an ad hoc spoken document retrieval subtask (SDR). Both subtasks target search terms, passages and documents included in academic and simulated lectures of the Corpus of Spontaneous Japanese. Seven teams participated in the STD subtask and five in the SDR subtask. The results obtained through the evaluation in the workshop are discussed.

2008

The Spoken Document Processing Working Group, which is part of the special interest group of spoken language processing of the Information Processing Society of Japan, is developing a test collection for evaluation of spoken document retrieval systems. A prototype of the test collection consists of a set of textual queries, relevant segment lists, and transcriptions by an automatic speech recognition system, allowing retrieval from the Corpus of Spontaneous Japanese (CSJ). From about 100 initial queries, application of the criteria that a query should have more than five relevant segments that consist of about one minute speech segments yielded 39 queries. Targeting the test collection, an ad hoc retrieval experiment was also conducted to assess the baseline retrieval performance by applying a standard method for spoken document retrieval.

2003

2001

2000