%0 Conference Proceedings
%T DISTANT-CTO: A Zero Cost, Distantly Supervised Approach to Improve Low-Resource Entity Extraction Using Clinical Trials Literature
%A Dhrangadhariya, Anjani
%A Müller, Henning
%Y Demner-Fushman, Dina
%Y Cohen, Kevin Bretonnel
%Y Ananiadou, Sophia
%Y Tsujii, Junichi
%S Proceedings of the 21st Workshop on Biomedical Language Processing
%D 2022
%8 May
%I Association for Computational Linguistics
%C Dublin, Ireland
%F dhrangadhariya-muller-2022-distant
%X PICO recognition is an information extraction task for identifying participant, intervention, comparator, and outcome information from clinical literature. Manually identifying PICO information is the most time-consuming step for conducting systematic reviews (SR), which is already labor-intensive. A lack of diversified and large, annotated corpora restricts innovation and adoption of automated PICO recognition systems. The largest-available PICO entity/span corpus is manually annotated which is too expensive for a majority of the scientific community. To break through the bottleneck, we propose DISTANT-CTO, a novel distantly supervised PICO entity extraction approach using the clinical trials literature, to generate a massive weakly-labeled dataset with more than a million ‘Intervention’ and ‘Comparator’ entity annotations. We train distant NER (named-entity recognition) models using this weakly-labeled dataset and demonstrate that it outperforms even the sophisticated models trained on the manually annotated dataset with a 2% F1 improvement over the Intervention entity of the PICO benchmark and more than 5% improvement when combined with the manually annotated dataset. We investigate the generalizability of our approach and gain an impressive F1 score on another domain-specific PICO benchmark. The approach is not only zero-cost but is also scalable for a constant stream of PICO entity annotations.
%R 10.18653/v1/2022.bionlp-1.34
%U https://aclanthology.org/2022.bionlp-1.34
%U https://doi.org/10.18653/v1/2022.bionlp-1.34
%P 345-358