A Dataset for Physical and Abstract Plausibility and Sources of Human Disagreement

Annerose Eichel, Sabine Schulte Im Walde


Abstract
We present a novel dataset for physical and abstract plausibility of events in English. Based on naturally occurring sentences extracted from Wikipedia, we infiltrate degrees of abstractness, and automatically generate perturbed pseudo-implausible events. We annotate a filtered and balanced subset for plausibility using crowd-sourcing, and perform extensive cleansing to ensure annotation quality. In-depth quantitative analyses indicate that annotators favor plausibility over implausibility and disagree more on implausible events. Furthermore, our plausibility dataset is the first to capture abstractness in events to the same extent as concreteness, and we find that event abstractness has an impact on plausibility ratings: more concrete event participants trigger a perception of implausibility.
Anthology ID:
2023.law-1.4
Volume:
Proceedings of the 17th Linguistic Annotation Workshop (LAW-XVII)
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Jakob Prange, Annemarie Friedrich
Venue:
LAW
Publisher:
Association for Computational Linguistics
Pages:
31–45
URL:
https://aclanthology.org/2023.law-1.4
DOI:
10.18653/v1/2023.law-1.4
Cite (ACL):
Annerose Eichel and Sabine Schulte Im Walde. 2023. A Dataset for Physical and Abstract Plausibility and Sources of Human Disagreement. In Proceedings of the 17th Linguistic Annotation Workshop (LAW-XVII), pages 31–45, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
A Dataset for Physical and Abstract Plausibility and Sources of Human Disagreement (Eichel & Schulte Im Walde, LAW 2023)
PDF:
https://aclanthology.org/2023.law-1.4.pdf
Video:
https://aclanthology.org/2023.law-1.4.mp4