Integrating Vision and Language Datasets to Measure Word Concreteness

Gitit Kehat, James Pustejovsky


Abstract
We present and exploit the inherent visualizability properties of words in visual corpora (the textual components of vision-language datasets) to compute concreteness scores for words. Our simple method does not require hand-annotated concreteness score lists for training, and yields state-of-the-art results when evaluated against concreteness score lists and previously derived scores, as well as when used for metaphor detection.
Anthology ID:
I17-2018
Volume:
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 2: Short Papers)
Month:
November
Year:
2017
Address:
Taipei, Taiwan
Editors:
Greg Kondrak, Taro Watanabe
Venue:
IJCNLP
Publisher:
Asian Federation of Natural Language Processing
Pages:
103–108
URL:
https://aclanthology.org/I17-2018
Cite (ACL):
Gitit Kehat and James Pustejovsky. 2017. Integrating Vision and Language Datasets to Measure Word Concreteness. In Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 2: Short Papers), pages 103–108, Taipei, Taiwan. Asian Federation of Natural Language Processing.
Cite (Informal):
Integrating Vision and Language Datasets to Measure Word Concreteness (Kehat & Pustejovsky, IJCNLP 2017)
PDF:
https://aclanthology.org/I17-2018.pdf
Data
Flickr30k, MS COCO, Visual Genome