Measuring Sentence-Level and Aspect-Level (Un)certainty in Science Communications

Jiaxin Pei, David Jurgens


Abstract
Certainty and uncertainty are fundamental to science communication. Hedges have widely been used as proxies for uncertainty. However, certainty is a complex construct, with authors expressing not only the degree but the type and aspects of uncertainty in order to give the reader a certain impression of what is known. Here, we introduce a new study of certainty that models both the level and the aspects of certainty in scientific findings. Using a new dataset of 2167 annotated scientific findings, we demonstrate that hedges alone account for only a partial explanation of certainty. We show that both the overall certainty and individual aspects can be predicted with pre-trained language models, providing a more complete picture of the author’s intended communication. Downstream analyses on 431K scientific findings from news and scientific abstracts demonstrate that modeling sentence-level and aspect-level certainty is meaningful for areas like science communication. Both the model and datasets used in this paper are released at https://blablablab.si.umich.edu/projects/certainty/.
Anthology ID:
2021.emnlp-main.784
Volume:
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2021
Address:
Online and Punta Cana, Dominican Republic
Venue:
EMNLP
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
9959–10011
Language:
URL:
https://aclanthology.org/2021.emnlp-main.784
DOI:
Bibkey:
Copy Citation:
PDF:
https://aclanthology.org/2021.emnlp-main.784.pdf