Semi-Supervised Exaggeration Detection of Health Science Press Releases

Dustin Wright, Isabelle Augenstein


Abstract
Public trust in science depends on honest and factual communication of scientific papers. However, recent studies have demonstrated a tendency of news media to misrepresent scientific papers by exaggerating their findings. Given this, we present a formalization of and study into the problem of exaggeration detection in science communication. While there is an abundance of scientific papers and popular media articles written about them, the articles very rarely include a direct link to the original paper, which makes data collection challenging and necessitates few-shot learning. We address this by curating a set of labeled press release/abstract pairs from existing expert-annotated studies on exaggeration in press releases of scientific papers, suitable for benchmarking the performance of machine learning models on the task. Using limited data from this and previous studies on exaggeration detection in science, we introduce MT-PET, a multi-task version of Pattern Exploiting Training (PET), which leverages knowledge from complementary cloze-style QA tasks to improve few-shot learning. We demonstrate that MT-PET outperforms PET and supervised learning both when data is limited and when there is an abundance of data for the main task.
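
For readers unfamiliar with cloze-style reformulation, the following is a minimal sketch of how a classification instance could be turned into a fill-in-the-blank prompt in the spirit of PET. The abstract does not specify the patterns, label set, or verbalizer used in the paper, so the pattern wording, the label names, the `VERBALIZER` mapping, the `[MASK]` placeholder, and both function names are hypothetical and chosen only for illustration. No language model is called; the token scores are passed in as plain numbers.

```python
from typing import Dict

# Placeholder mask string (assumption); the real mask token depends on the model.
MASK = "[MASK]"

# Hypothetical verbalizer: maps each assumed label to a single word
# that a masked language model could predict at the mask position.
VERBALIZER: Dict[str, str] = {
    "same": "identical",
    "exaggerates": "stronger",
    "downplays": "weaker",
}


def build_cloze(abstract_claim: str, press_claim: str) -> str:
    """Wrap a claim pair in a hypothetical cloze pattern with one mask slot."""
    return (
        f'Scientists claim: "{abstract_claim}" '
        f'Reporters claim: "{press_claim}" '
        f"The reporters' claim is {MASK}."
    )


def label_from_token_scores(token_scores: Dict[str, float]) -> str:
    """Return the label whose verbalizer word has the highest score.

    Words missing from token_scores are treated as having the lowest score.
    """
    return max(
        VERBALIZER,
        key=lambda label: token_scores.get(VERBALIZER[label], float("-inf")),
    )


if __name__ == "__main__":
    prompt = build_cloze("X is associated with Y.", "X causes Y.")
    print(prompt)
    # Scores below are made up to show the lookup, not model output.
    print(label_from_token_scores({"identical": 0.1, "stronger": 0.7, "weaker": 0.2}))
```

In an actual PET-style setup, the scores would come from a masked language model's predictions at the mask position; a multi-task variant would presumably define separate patterns and verbalizers for each complementary task, but those details are not given in the abstract.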
Anthology ID:
2021.emnlp-main.845
Volume:
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2021
Address:
Online and Punta Cana, Dominican Republic
Editors:
Marie-Francine Moens, Xuanjing Huang, Lucia Specia, Scott Wen-tau Yih
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
10824–10836
URL:
https://aclanthology.org/2021.emnlp-main.845
DOI:
10.18653/v1/2021.emnlp-main.845
Cite (ACL):
Dustin Wright and Isabelle Augenstein. 2021. Semi-Supervised Exaggeration Detection of Health Science Press Releases. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 10824–10836, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.
Cite (Informal):
Semi-Supervised Exaggeration Detection of Health Science Press Releases (Wright & Augenstein, EMNLP 2021)
PDF:
https://aclanthology.org/2021.emnlp-main.845.pdf
Video:
https://aclanthology.org/2021.emnlp-main.845.mp4
Code:
copenlu/scientific-exaggeration-detection