CA-EHN: Commonsense Analogy from E-HowNet

Peng-Hsuan Li; Tsan-Yu Yang; Wei-Yun Ma

Abstract

Embedding commonsense knowledge is crucial for end-to-end models to generalize inference beyond training corpora. However, existing word analogy datasets have tended to be handcrafted, involving permutations of hundreds of words with only dozens of pre-defined relations, mostly morphological relations and named entities. In this work, we model commonsense knowledge down to word-level analogical reasoning by leveraging E-HowNet, an ontology that annotates 88K Chinese words with their structured sense definitions and English translations. We present CA-EHN, the first commonsense word analogy dataset containing 90,505 analogies covering 5,656 words and 763 relations. Experiments show that CA-EHN stands out as a great indicator of how well word representations embed commonsense knowledge. The dataset is publicly available at https://github.com/ckiplab/CA-EHN.
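For readers unfamiliar with word analogy benchmarks, the following is a minimal sketch of how a dataset of analogy quadruples (a : b :: c : d) is commonly used to score word embeddings, using the standard vector-offset method. It is not taken from the paper and may differ from the authors' evaluation protocol. The toy vectors, the English words, and the function names `solve_analogy` and `analogy_accuracy` are all hypothetical and chosen for illustration only.

```python
import math

# Hypothetical toy embeddings for illustration only; a real evaluation
# would load pretrained word vectors instead.
EMB = {
    "man": [1.0, 0.0, 0.0],
    "woman": [1.0, 1.0, 0.0],
    "king": [1.0, 0.0, 1.0],
    "queen": [1.0, 1.0, 1.0],
    "apple": [0.0, 0.2, 0.1],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def solve_analogy(a, b, c, emb):
    """Answer "a is to b as c is to ?" with the vector-offset method.

    Returns the vocabulary word (excluding a, b, c) whose vector is
    closest in cosine similarity to emb[b] - emb[a] + emb[c].
    """
    target = [y - x + z for x, y, z in zip(emb[a], emb[b], emb[c])]
    candidates = (w for w in emb if w not in {a, b, c})
    return max(candidates, key=lambda w: cosine(emb[w], target))


def analogy_accuracy(analogies, emb):
    """Fraction of in-vocabulary quadruples (a, b, c, d) answered correctly.

    Quadruples containing any out-of-vocabulary word are skipped.
    """
    covered = [q for q in analogies if all(w in emb for w in q)]
    if not covered:
        return 0.0
    correct = sum(solve_analogy(a, b, c, emb) == d for a, b, c, d in covered)
    return correct / len(covered)
```

Under this scheme, a higher accuracy on the analogy set is read as evidence that the embedding space encodes the relations the analogies are built from. Skipping out-of-vocabulary quadruples is one possible design choice; counting them as errors is another.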

Anthology ID: 2020.lrec-1.365
Volume: Proceedings of the Twelfth Language Resources and Evaluation Conference
Month: May
Year: 2020
Address: Marseille, France
Editors: Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue: LREC
Publisher: European Language Resources Association
Pages: 2984–2990
Language: English
URL: https://aclanthology.org/2020.lrec-1.365/
Cite (ACL): Peng-Hsuan Li, Tsan-Yu Yang, and Wei-Yun Ma. 2020. CA-EHN: Commonsense Analogy from E-HowNet. In Proceedings of the Twelfth Language Resources and Evaluation Conference, pages 2984–2990, Marseille, France. European Language Resources Association.
Cite (Informal): CA-EHN: Commonsense Analogy from E-HowNet (Li et al., LREC 2020)
PDF: https://aclanthology.org/2020.lrec-1.365.pdf
