Ruyue Agnes Bi


2023

pdf bib
Clinical BERTScore: An Improved Measure of Automatic Speech Recognition Performance in Clinical Settings
Joel Shor | Ruyue Agnes Bi | Subhashini Venugopalan | Steven Ibara | Roman Goldenberg | Ehud Rivlin
Proceedings of the 5th Clinical Natural Language Processing Workshop

Automatic Speech Recognition (ASR) in medical contexts has the potential to save time, cut costs, increase report accuracy, and reduce physician burnout. However, the healthcare industry has been slower to adopt this technology, in part due to the importance of avoiding medically-relevant transcription mistakes. In this work, we present the Clinical BERTScore (CBERTScore), an ASR metric that penalizes clinically-relevant mistakes more than others. We collect a benchmark of 18 clinician preferences on 149 realistic medical sentences called the Clinician Transcript Preference benchmark (CTP) and make it publicly available for the community to further develop clinically-aware ASR metrics. To our knowledge, this is the first public dataset of its kind. We demonstrate that our metric more closely aligns with clinician preferences on medical sentences as compared to other metrics (WER, BLUE, METEOR, etc), sometimes by wide margins.