CogniVal: A Framework for Cognitive Word Embedding Evaluation

Nora Hollenstein, Antonio de la Torre, Nicolas Langer, Ce Zhang


Abstract
One interesting way to evaluate word representations is to measure how well they reflect the semantic representations in the human brain. However, most, if not all, previous work focuses only on small datasets and a single modality. In this paper, we present the first multi-modal framework for evaluating English word representations based on cognitive lexical semantics. Six types of word embeddings are evaluated by fitting them to 15 datasets of eye-tracking, EEG and fMRI signals recorded during language processing. To achieve a global score over all evaluation hypotheses, we apply statistical significance testing that accounts for the multiple comparisons problem. The framework is publicly available and easily extensible to include other intrinsic and extrinsic evaluation methods. We find strong correlations in the results across cognitive datasets, across recording modalities, and with the embeddings' performance on extrinsic NLP tasks.
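The aggregation step mentioned in the abstract (significance testing over many evaluation hypotheses, corrected for multiple comparisons) can be illustrated with a minimal sketch. The abstract does not say which test or correction is used, so this example assumes a Bonferroni correction and assumes the global score is the fraction of hypotheses that remain significant. The function names, hypothesis labels and p-values are all hypothetical, made up for illustration, and are not taken from the paper or from the DS3Lab/cognival code.

```python
from typing import Dict, Tuple


def bonferroni_decisions(
    p_values: Dict[str, float], alpha: float = 0.05
) -> Tuple[float, Dict[str, bool]]:
    """Apply a Bonferroni correction (an assumed choice of correction).

    Each hypothesis is tested against alpha divided by the number of
    hypotheses, which controls the family-wise error rate.
    """
    if not p_values:
        raise ValueError("at least one hypothesis is required")
    corrected_alpha = alpha / len(p_values)
    decisions = {name: p < corrected_alpha for name, p in p_values.items()}
    return corrected_alpha, decisions


def global_score(decisions: Dict[str, bool]) -> float:
    """Fraction of hypotheses accepted as significant (an assumed aggregate)."""
    return sum(decisions.values()) / len(decisions)


# Hypothetical p-values, one per (modality, hypothesis) pair; not real results.
example_p_values = {
    "eye-tracking/h1": 0.001,
    "eeg/h1": 0.02,
    "fmri/h1": 0.2,
    "fmri/h2": 0.004,
}

corrected_alpha, decisions = bonferroni_decisions(example_p_values)
score = global_score(decisions)
print(corrected_alpha, decisions, score)
```

With four hypotheses, each p-value is compared against 0.05 / 4 rather than 0.05, so a result such as p = 0.02 that would pass an uncorrected test no longer counts toward the aggregate.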
Anthology ID:
K19-1050
Volume:
Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL)
Month:
November
Year:
2019
Address:
Hong Kong, China
Venue:
CoNLL
SIG:
SIGNLL
Publisher:
Association for Computational Linguistics
Pages:
538–549
URL:
https://aclanthology.org/K19-1050
DOI:
10.18653/v1/K19-1050
Cite (ACL):
Nora Hollenstein, Antonio de la Torre, Nicolas Langer, and Ce Zhang. 2019. CogniVal: A Framework for Cognitive Word Embedding Evaluation. In Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL), pages 538–549, Hong Kong, China. Association for Computational Linguistics.
Cite (Informal):
CogniVal: A Framework for Cognitive Word Embedding Evaluation (Hollenstein et al., CoNLL 2019)
PDF:
https://aclanthology.org/K19-1050.pdf
Supplementary material:
 K19-1050.Supplementary_Material.zip
Attachment:
 K19-1050.Attachment.zip
Code:
 DS3Lab/cognival
Data:
 CoNLL-2003, SQuAD