Shurong Sheng


2016

A Dataset for Multimodal Question Answering in the Cultural Heritage Domain
Shurong Sheng | Luc Van Gool | Marie-Francine Moens
Proceedings of the Workshop on Language Technology Resources and Tools for Digital Humanities (LT4DH)

Multimodal question answering in the cultural heritage domain allows visitors to ask questions in a more natural way and thus provides a better user experience with cultural objects while visiting a museum, landmark or any other historical site. In this paper, we describe the construction of a gold standard dataset that will aid research on multimodal question answering in the cultural heritage domain. The dataset, which will soon be released to the public, contains multimodal content including images of typical artworks from the fascinating ancient Egyptian Amarna period, related documents about the artworks that themselves contain images, and over 800 multimodal queries integrating visual and textual questions. The multimodal questions and related documents are all in English. The multimodal questions are linked to the relevant paragraphs in the related documents that contain the answer to each query.