comp-syn: Perceptually Grounded Word Embeddings with Color

Bhargav Srinivasa Desikan, Tasker Hull, Ethan Nadler, Douglas Guilbeault, Aabir Abubakar Kar, Mark Chu, Donald Ruggiero Lo Sardo


Abstract
Popular approaches to natural language processing create word embeddings based on textual co-occurrence patterns, but often ignore embodied, sensory aspects of language. Here, we introduce the Python package comp-syn, which provides grounded word embeddings based on the perceptually uniform color distributions of Google Image search results. We demonstrate that comp-syn significantly enriches models of distributional semantics. In particular, we show that (1) comp-syn predicts human judgments of word concreteness with greater accuracy and in a more interpretable fashion than word2vec using low-dimensional word–color embeddings, and (2) comp-syn performs comparably to word2vec on a metaphorical vs. literal word-pair classification task. comp-syn is open-source on PyPI and is compatible with mainstream machine-learning Python packages. Our package release includes word–color embeddings for over 40,000 English words, each associated with crowd-sourced word concreteness judgments.
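To make the idea of a word–color embedding concrete, the following is a minimal sketch, not the comp-syn API. It assumes a word's images have already been reduced to a list of RGB pixels, bins them on a coarse grid to form a normalized color distribution, and compares two such distributions with the Jensen-Shannon divergence. The function names, the 2×2×2 grid, the use of raw RGB in place of a perceptually uniform color space, and the choice of divergence are all illustrative assumptions; the pixel values are made up.

```python
import math
from collections import Counter


def color_embedding(pixels, bins_per_channel=2):
    """Normalized color histogram over a coarse 3-D grid.

    Hypothetical helper: raw RGB bins stand in for bins in a
    perceptually uniform color space.
    """
    if not pixels:
        raise ValueError("need at least one pixel")
    n = bins_per_channel
    counts = Counter()
    for r, g, b in pixels:
        # Map each 0-255 channel value to a bin index in [0, n - 1].
        idx = tuple(min(c * n // 256, n - 1) for c in (r, g, b))
        counts[idx] += 1
    total = sum(counts.values())
    # Flatten the grid in a fixed order so embeddings are comparable.
    return [
        counts[(i, j, k)] / total
        for i in range(n)
        for j in range(n)
        for k in range(n)
    ]


def js_divergence(p, q):
    """Jensen-Shannon divergence with base-2 logs, bounded in [0, 1]."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with zero probability in x contribute nothing.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Made-up pixels standing in for image search results of two words.
sky = color_embedding([(30, 60, 220), (40, 80, 230), (200, 220, 250)])
fire = color_embedding([(230, 60, 20), (240, 120, 30), (250, 200, 40)])
distance = js_divergence(sky, fire)
```

With this toy data each embedding has 8 dimensions, one per color bin, so every dimension can be read directly as the share of pixels falling in a region of color space. That is one sense in which a low-dimensional color embedding can be more interpretable than a dense text-based vector.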
Anthology ID:
2020.coling-main.154
Volume:
Proceedings of the 28th International Conference on Computational Linguistics
Month:
December
Year:
2020
Address:
Barcelona, Spain (Online)
Editors:
Donia Scott, Nuria Bel, Chengqing Zong
Venue:
COLING
Publisher:
International Committee on Computational Linguistics
Pages:
1744–1751
URL:
https://aclanthology.org/2020.coling-main.154
DOI:
10.18653/v1/2020.coling-main.154
Cite (ACL):
Bhargav Srinivasa Desikan, Tasker Hull, Ethan Nadler, Douglas Guilbeault, Aabir Abubakar Kar, Mark Chu, and Donald Ruggiero Lo Sardo. 2020. comp-syn: Perceptually Grounded Word Embeddings with Color. In Proceedings of the 28th International Conference on Computational Linguistics, pages 1744–1751, Barcelona, Spain (Online). International Committee on Computational Linguistics.
Cite (Informal):
comp-syn: Perceptually Grounded Word Embeddings with Color (Srinivasa Desikan et al., COLING 2020)
PDF:
https://aclanthology.org/2020.coling-main.154.pdf
Code:
comp-syn/comp-syn