DIDEC: The Dutch Image Description and Eye-tracking Corpus

Emiel van Miltenburg, Ákos Kádár, Ruud Koolen, Emiel Krahmer


Abstract
We present a corpus of spoken Dutch image descriptions, paired with two sets of eye-tracking data: Free viewing, where participants look at images without any particular purpose, and Description viewing, where we track eye movements while participants produce spoken descriptions of the images they are viewing. This paper describes the data collection procedure and the corpus itself, and provides an initial analysis of self-corrections in image descriptions. We also present two studies showing the potential of this data. Though these studies mainly serve as an example, we do find two interesting results: (1) the eye-tracking data for the description viewing task is more coherent than for the free-viewing task; (2) variation in image descriptions (also called ‘image specificity’; Jas and Parikh, 2015) is only moderately correlated across different languages. Our corpus can be used to gain a deeper understanding of the image description task, particularly how visual attention is correlated with the image description process.
Anthology ID:
C18-1310
Volume:
Proceedings of the 27th International Conference on Computational Linguistics
Month:
August
Year:
2018
Address:
Santa Fe, New Mexico, USA
Venue:
COLING
Publisher:
Association for Computational Linguistics
Pages:
3658–3669
URL:
https://aclanthology.org/C18-1310
Cite (ACL):
Emiel van Miltenburg, Ákos Kádár, Ruud Koolen, and Emiel Krahmer. 2018. DIDEC: The Dutch Image Description and Eye-tracking Corpus. In Proceedings of the 27th International Conference on Computational Linguistics, pages 3658–3669, Santa Fe, New Mexico, USA. Association for Computational Linguistics.
Cite (Informal):
DIDEC: The Dutch Image Description and Eye-tracking Corpus (van Miltenburg et al., COLING 2018)
PDF:
https://aclanthology.org/C18-1310.pdf
Data
COCO, Flickr30k, SALICON, Visual Genome