WAX: A New Dataset for Word Association eXplanations

Chunhua Liu, Trevor Cohn, Simon De Deyne, Lea Frermann


Abstract
Word associations are among the most common paradigms to study the human mental lexicon. While their structure and types of associations have been well studied, surprisingly little attention has been given to the question of why participants produce the observed associations. Answering this question would not only advance understanding of human cognition, but could also aid machines in learning and representing basic commonsense knowledge. This paper introduces a large, crowd-sourced dataset of English word associations with explanations, labeled with high-level relation types. We present an analysis of the provided explanations, and design several tasks to probe the extent to which current pre-trained language models capture the underlying relations. Our experiments show that models struggle to capture the diversity of human associations, suggesting WAX is a rich benchmark for commonsense modeling and generation.
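To make the dataset's structure concrete, below is a minimal Python sketch of what a single WAX record might look like: a cue word, the association a participant produced, the participant's free-text explanation, and a high-level relation label. The field names and example values are illustrative assumptions, not the dataset's actual schema.

from dataclasses import dataclass

@dataclass
class WaxRecord:
    cue: str          # stimulus word shown to the participant
    association: str  # word the participant produced in response
    explanation: str  # participant's free-text reason for the association
    relation: str     # high-level relation label assigned to the pair

# Hypothetical example record (values invented for illustration).
example = WaxRecord(
    cue="winter",
    association="snow",
    explanation="snow usually falls in winter",
    relation="temporal",
)
print(example.cue, "->", example.association, ":", example.explanation)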
Anthology ID:
2022.aacl-main.9
Volume:
Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
Month:
November
Year:
2022
Address:
Online only
Editors:
Yulan He, Heng Ji, Sujian Li, Yang Liu, Chia-Hui Chang
Venues:
AACL | IJCNLP
Publisher:
Association for Computational Linguistics
Pages:
106–120
URL:
https://aclanthology.org/2022.aacl-main.9
Cite (ACL):
Chunhua Liu, Trevor Cohn, Simon De Deyne, and Lea Frermann. 2022. WAX: A New Dataset for Word Association eXplanations. In Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 106–120, Online only. Association for Computational Linguistics.
Cite (Informal):
WAX: A New Dataset for Word Association eXplanations (Liu et al., AACL-IJCNLP 2022)
PDF:
https://aclanthology.org/2022.aacl-main.9.pdf
Dataset:
2022.aacl-main.9.Dataset.zip