FAST: A carefully sampled and cognitively motivated dataset for distributional semantic evaluation

Stefan Evert, Gabriella Lapesa


Abstract
What is the first word that comes to your mind when you hear giraffe, or damsel, or freedom? Such free associations contain a huge amount of information on the mental representations of the corresponding concepts, and are thus an extremely valuable testbed for the evaluation of semantic representations extracted from corpora. In this paper, we present FAST (Free ASsociation Tasks), a free association dataset for English rigorously sampled from two standard free association norms collections (the Edinburgh Associative Thesaurus and the University of South Florida Free Association Norms), discuss two evaluation tasks, and provide baseline results. In parallel, we discuss methodological considerations concerning the desiderata for a proper evaluation of semantic representations.
Anthology ID: 2021.conll-1.46
Volume: Proceedings of the 25th Conference on Computational Natural Language Learning
Month: November
Year: 2021
Address: Online
Venues: CoNLL | EMNLP
SIG: SIGNLL
Publisher: Association for Computational Linguistics
Pages: 588–595
URL: https://aclanthology.org/2021.conll-1.46
PDF: https://aclanthology.org/2021.conll-1.46.pdf