Beyond Benchmarks: A Capability-Based Maturity Model for Systematic AI Integration in Hospitals

Rui Li; Xiaofen WU; Mingqian Liu; Xiaoxia Song; Xu Xiangjun; Jiacheng Qiao; Zhuang Boqin; Xu Chen

Abstract

Current Large Language Models (LLMs) demonstrate exceptional performance on medical benchmarks. However, models that excel on standardized tests of medical knowledge recall are not necessarily effective in real-world healthcare scenarios. This disparity between academic performance and clinical effectiveness arises because existing evaluations overemphasize knowledge retrieval and question answering (QA) while neglecting the high-load executive tasks of real clinical workflows. The effective execution of such tasks depends not only on model reasoning but also on the overall digital maturity of the healthcare institution. To address this, we propose a “Capability-Based Hospital AI Maturity Model” framework, which establishes a layered, capability-based maturity system. By categorizing hospital AI capabilities into distinct maturity levels, it provides a clear, stepwise evolutionary path for hospitals, guiding them from foundational infrastructure construction to ubiquitous intelligence. Guided by this framework, we constructed ten representative real-world clinical scenarios as a reference test set and compared the performance of multiple models across benchmarks and real-world scenarios. Preliminary results suggest that, compared to relying solely on academic benchmark scores, this mode of maturity assessment, which integrates system governance and scenario constraints, may provide a more valuable basis for AI adoption in medical institutions.

Anthology ID: 2026.findings-acl.1047
Volume: Findings of the Association for Computational Linguistics: ACL 2026
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 20883–20895
URL: https://aclanthology.org/2026.findings-acl.1047/
Cite (ACL): Rui Li, Xiaofen WU, Mingqian Liu, Xiaoxia Song, Xu Xiangjun, Jiacheng Qiao, Zhuang Boqin, and Xu Chen. 2026. Beyond Benchmarks: A Capability-Based Maturity Model for Systematic AI Integration in Hospitals. In Findings of the Association for Computational Linguistics: ACL 2026, pages 20883–20895, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): Beyond Benchmarks: A Capability-Based Maturity Model for Systematic AI Integration in Hospitals (Li et al., Findings 2026)
PDF: https://aclanthology.org/2026.findings-acl.1047.pdf
Checklist: 2026.findings-acl.1047.checklist.pdf
