Using Sparse Semantic Embeddings Learned from Multimodal Text and Image Data to Model Human Conceptual Knowledge

Steven Derby, Paul Miller, Brian Murphy, Barry Devereux


Abstract
Distributional models provide a convenient way to model semantics using dense embedding spaces derived from unsupervised learning algorithms. However, the dimensions of dense embedding spaces are not designed to resemble human semantic knowledge. Moreover, embeddings are often built from a single source of information (typically text data), even though neurocognitive research suggests that semantics is deeply linked to both language and perception. In this paper, we combine multimodal information from both text and image-based representations derived from state-of-the-art distributional models to produce sparse, interpretable vectors using Joint Non-Negative Sparse Embedding. Through in-depth analyses comparing these sparse models to human-derived behavioural and neuroimaging data, we demonstrate their ability to predict interpretable linguistic descriptions of human ground-truth semantic knowledge.
Anthology ID:
K18-1026
Volume:
Proceedings of the 22nd Conference on Computational Natural Language Learning
Month:
October
Year:
2018
Address:
Brussels, Belgium
Editors:
Anna Korhonen, Ivan Titov
Venue:
CoNLL
SIG:
SIGNLL
Publisher:
Association for Computational Linguistics
Note:
Pages:
260–270
Language:
URL:
https://aclanthology.org/K18-1026
DOI:
10.18653/v1/K18-1026
Bibkey:
Cite (ACL):
Steven Derby, Paul Miller, Brian Murphy, and Barry Devereux. 2018. Using Sparse Semantic Embeddings Learned from Multimodal Text and Image Data to Model Human Conceptual Knowledge. In Proceedings of the 22nd Conference on Computational Natural Language Learning, pages 260–270, Brussels, Belgium. Association for Computational Linguistics.
Cite (Informal):
Using Sparse Semantic Embeddings Learned from Multimodal Text and Image Data to Model Human Conceptual Knowledge (Derby et al., CoNLL 2018)
Copy Citation:
PDF:
https://aclanthology.org/K18-1026.pdf
Data
ImageNet