Zecong Tang
2025
CLEAR: A Clinically Grounded Tabular Framework for Radiology Report Evaluation
Yuyang Jiang | Chacha Chen | Shengyuan Wang | Feng Li | Zecong Tang | Benjamin M. Mervak | Lydia Chelala | Christopher M Straus | Reve Chahine | Samuel G. Armato III | Chenhao Tan
Findings of the Association for Computational Linguistics: EMNLP 2025
Existing metrics often lack the granularity and interpretability to capture nuanced clinical differences between candidate and ground-truth radiology reports, resulting in suboptimal evaluation. We introduce a **Cl**inically grounded tabular framework with **E**xpert-curated labels and **A**ttribute-level comparison for **R**adiology report evaluation (**CLEAR**). CLEAR not only examines whether a report accurately identifies the presence or absence of medical conditions, but also assesses whether the report precisely describes each positively identified condition across five key attributes: first occurrence, change, severity, descriptive location, and recommendation. Compared with prior work, CLEAR’s multi-dimensional, attribute-level outputs enable a more comprehensive and clinically interpretable evaluation of report quality. Additionally, to measure the clinical alignment of CLEAR, we collaborated with five board-certified radiologists to develop **CLEAR-Bench**, a dataset of 100 chest radiograph reports from MIMIC-CXR, annotated across 6 curated attributes and 13 CheXpert conditions. Our experiments demonstrate that CLEAR achieves high accuracy in extracting clinical attributes and provides automated metrics that are strongly aligned with clinical judgment.