Extracting relations between outcomes and significance levels in Randomized Controlled Trials (RCTs) publications

Anna Koroleva, Patrick Paroubek


Abstract
Randomized controlled trials assess the effects of an experimental intervention by comparing it to a control intervention with regard to a set of variables, the trial outcomes. Statistical hypothesis testing is used to test whether the experimental intervention is superior to the control. Statistical significance is typically reported for the measured outcomes and is an important characteristic of the results. We propose a machine learning approach to automatically extract reported outcomes, significance levels, and the relations between them. We annotated a corpus of 663 sentences with 2,552 outcome - significance level relations (1,372 positive and 1,180 negative). We compared several classifiers that use a manually crafted feature set with a number of deep learning models. The best performance (F-measure of 94%) was achieved by the fine-tuned BioBERT model.
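The positive and negative relation counts in the abstract suggest a pairwise classification setup, in which each outcome mentioned in a sentence is paired with each significance level in that sentence and the pair is labeled as related or not. The following is a minimal sketch of that framing, not the authors' implementation: the function names, the example sentence entities, and the choice to compute the F-measure on the positive class are all assumptions made for illustration.

```python
from itertools import product


def candidate_pairs(outcomes, significance_levels, gold_relations):
    """Pair every outcome with every significance level from one sentence.

    Each pair is labeled 1 (positive) if it is in gold_relations, else 0.
    Hypothetical helper; the abstract does not describe pair generation.
    """
    gold = set(gold_relations)
    return [
        (outcome, level, int((outcome, level) in gold))
        for outcome, level in product(outcomes, significance_levels)
    ]


def f_measure(gold_labels, predicted_labels):
    """F1 score for the positive class (assumed; the abstract does not specify)."""
    pairs = list(zip(gold_labels, predicted_labels))
    tp = sum(1 for g, p in pairs if g == 1 and p == 1)
    fp = sum(1 for g, p in pairs if g == 0 and p == 1)
    fn = sum(1 for g, p in pairs if g == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Invented entities for one sentence, for illustration only.
outcomes = ["systolic blood pressure", "heart rate"]
levels = ["p < 0.001", "p = 0.42"]
gold = [("systolic blood pressure", "p < 0.001"), ("heart rate", "p = 0.42")]

pairs = candidate_pairs(outcomes, levels, gold)
labels = [label for _, _, label in pairs]
```

In this sketch, two outcomes and two significance levels yield four candidate pairs, two positive and two negative, which is how a corpus can contain far more relations than sentences.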
Anthology ID:
W19-5038
Volume:
Proceedings of the 18th BioNLP Workshop and Shared Task
Month:
August
Year:
2019
Address:
Florence, Italy
Editors:
Dina Demner-Fushman, Kevin Bretonnel Cohen, Sophia Ananiadou, Junichi Tsujii
Venue:
BioNLP
SIG:
SIGBIOMED
Publisher:
Association for Computational Linguistics
Pages:
359–369
URL:
https://aclanthology.org/W19-5038/
DOI:
10.18653/v1/W19-5038
Cite (ACL):
Anna Koroleva and Patrick Paroubek. 2019. Extracting relations between outcomes and significance levels in Randomized Controlled Trials (RCTs) publications. In Proceedings of the 18th BioNLP Workshop and Shared Task, pages 359–369, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
Extracting relations between outcomes and significance levels in Randomized Controlled Trials (RCTs) publications (Koroleva & Paroubek, BioNLP 2019)
PDF:
https://aclanthology.org/W19-5038.pdf