A Visualization Approach for Rapid Labeling of Clinical Notes for Smoking Status Extraction

Saman Enayati, Ziyu Yang, Benjamin Lu, Slobodan Vucetic


Abstract
Labeling is typically the most human-intensive step in the development of supervised learning models. In this paper, we propose a simple and easy-to-implement visualization approach that reduces cognitive load and increases the speed of text labeling. The approach is fine-tuned for the task of extracting patient smoking status from clinical notes. It consists of ordering sentences that mention smoking, centering them on smoking tokens, and annotating them to highlight informative parts of the text. Our experiments on clinical notes from the MIMIC-III clinical database demonstrate that our visualization approach enables human annotators to label sentences up to 3 times faster than with a baseline approach.
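The three steps named in the abstract (ordering, centering, annotating) can be illustrated with a minimal keyword-in-context sketch. This is not the authors' implementation: the keyword pattern, the naive sentence splitter, the sort key (keyword, then the text to its right), and the bracketed upper-case marking are all assumptions made for illustration, and the function names are hypothetical.

```python
import re

# Hypothetical keyword pattern; the paper's actual list of smoking tokens is not given here.
SMOKING_PATTERN = re.compile(r"\b(smok\w*|tobacco|cigarette\w*)\b", re.IGNORECASE)


def split_sentences(note):
    """Naive splitter (assumption): break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", note) if s.strip()]


def centered_views(notes, width=30):
    """Return sorted (left, keyword, right) triples for sentences that mention smoking."""
    views = []
    for note in notes:
        for sentence in split_sentences(note):
            match = SMOKING_PATTERN.search(sentence)
            if match is None:
                continue  # sentence does not mention smoking
            left = sentence[:match.start()][-width:]   # context before the keyword
            right = sentence[match.end():][:width]     # context after the keyword
            views.append((left, match.group(0), right))
    # Ordering (assumption): group by keyword, then by the text that follows it,
    # so that similar phrasings end up next to each other.
    views.sort(key=lambda v: (v[1].lower(), v[2].lower()))
    return views


def render(views, width=30):
    """Right-align the left context so every keyword starts in the same column."""
    return [f"{left:>{width}} [{kw.upper()}] {right}" for left, kw, right in views]


if __name__ == "__main__":
    notes = [
        "Patient denies smoking. No fever.",
        "He quit smoking 10 years ago.",
        "History of tobacco use.",
    ]
    for line in render(centered_views(notes)):
        print(line)
```

Aligning every sentence on the keyword column lets an annotator scan a single vertical band of text, which is the intuition behind centering; the actual ordering and highlighting rules used in the paper may differ.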
Anthology ID:
2021.dash-1.4
Volume:
Proceedings of the Second Workshop on Data Science with Human in the Loop: Language Advances
Month:
June
Year:
2021
Address:
Online
Editors:
Eduard Dragut, Yunyao Li, Lucian Popa, Slobodan Vucetic
Venue:
DaSH
Publisher:
Association for Computational Linguistics
Pages:
24–30
URL:
https://aclanthology.org/2021.dash-1.4
DOI:
10.18653/v1/2021.dash-1.4
Cite (ACL):
Saman Enayati, Ziyu Yang, Benjamin Lu, and Slobodan Vucetic. 2021. A Visualization Approach for Rapid Labeling of Clinical Notes for Smoking Status Extraction. In Proceedings of the Second Workshop on Data Science with Human in the Loop: Language Advances, pages 24–30, Online. Association for Computational Linguistics.
Cite (Informal):
A Visualization Approach for Rapid Labeling of Clinical Notes for Smoking Status Extraction (Enayati et al., DaSH 2021)
PDF:
https://aclanthology.org/2021.dash-1.4.pdf