A Visualization Approach for Rapid Labeling of Clinical Notes for Smoking Status Extraction
Saman Enayati | Ziyu Yang | Benjamin Lu | Slobodan Vucetic
Proceedings of the Second Workshop on Data Science with Human in the Loop: Language Advances
Labeling is typically the most human-intensive step during the development of supervised learning models. In this paper, we propose a simple and easy-to-implement visualization approach that reduces cognitive load and increases the speed of text labeling. The approach is fine-tuned for task of extraction of patient smoking status from clinical notes. The proposed approach consists of the ordering of sentences that mention smoking, centering them at smoking tokens, and annotating to enhance informative parts of the text. Our experiments on clinical notes from the MIMIC-III clinical database demonstrate that our visualization approach enables human annotators to label sentences up to 3 times faster than with a baseline approach.