Travel on the ICD Tree: Benchmarking Agentic Reasoning for ICD Coding from Chinese Electronic Medical Records

Xinjie Xu; Yongqi Fan; Shuang-shuang Chen; Qi Ye; Weibin Guo; Xinxuan Hu

Abstract

Accurate International Classification of Diseases (ICD) coding is crucial for hospital management and healthcare data governance. In clinical practice, straightforward cases can often be matched directly to ICD codes via diagnostic text, which makes retrieval-based methods the natural baseline. More advanced approaches use large language models to rerank the retrieved candidates. Real-world coding scenarios, however, are typically more complex and demand reasoning that goes beyond surface descriptions. For instance, a coder must synthesize key information such as disease subtype, anatomical location, and complications from complex progress notes to identify the primary diagnosis accurately. Moreover, a comprehensive evaluation framework for ICD coding based on complete electronic medical records (EMRs) is still lacking. To address these challenges, we construct the Code4Detail dataset, which comprises 560 real clinical records covering 434 common diseases across 19 core chapters of ICD-10. To systematically explore the capability boundaries of large language models under different paradigms, we further propose the Travel on the ICD Tree (ToT-ICD) evaluation framework. Unlike the conventional retrieval-recall approach, ToT-ICD treats ICD coding as a structured exploration process across a hierarchical taxonomy. We design an agentic workflow that integrates similarity retrieval, path-guided navigation, and dynamic backtracking, enabling logical reasoning and decision-making under coding rules.
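The idea of treating coding as exploration over a hierarchy, with navigation and backtracking, can be illustrated with a minimal sketch. This is not the ToT-ICD implementation, which the abstract does not specify. The `Node` class, the `navigate` function, the acceptance `threshold`, and the hand-picked `toy_scores` are all assumptions made for illustration; in a real system the relevance function would presumably come from similarity retrieval or a language model's judgement rather than a lookup table.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Node:
    """One node of a toy code hierarchy (hypothetical structure)."""
    code: str
    children: list["Node"] = field(default_factory=list)


def navigate(
    node: Node,
    relevance: Callable[[Node], float],
    threshold: float,
    trace: list[str],
) -> Optional[list[str]]:
    """Return the code path from `node` to an accepted leaf, or None."""
    trace.append(node.code)  # record every node that is visited
    if not node.children:
        # Accept a leaf only if its relevance clears the threshold.
        return [node.code] if relevance(node) >= threshold else None
    # Path-guided step: try the most relevant child first.
    for child in sorted(node.children, key=relevance, reverse=True):
        sub_path = navigate(child, relevance, threshold, trace)
        if sub_path is not None:
            return [node.code] + sub_path
    # Every child failed: returning None makes the caller backtrack.
    return None


# Toy subtree. The codes are used only as labels, and the scores are
# made-up numbers standing in for a retrieval or model-based scorer.
tree = Node("E10-E14", [
    Node("E10", [Node("E10.9")]),
    Node("E11", [Node("E11.2"), Node("E11.9")]),
])
toy_scores = {"E10": 0.6, "E10.9": 0.1, "E11": 0.5, "E11.2": 0.9, "E11.9": 0.2}

trace: list[str] = []
path = navigate(tree, lambda n: toy_scores.get(n.code, 0.0), 0.8, trace)
print(path)   # final path from the root to the accepted leaf
print(trace)  # visit order, including the abandoned branch
```

With these toy scores the search first descends into `E10`, rejects its only leaf, backtracks, and then reaches `E11.2` through `E11`. The visit trace makes the abandoned branch visible, which is the behavior that a flat retrieve-and-rerank pipeline has no counterpart for.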

Anthology ID: 2026.findings-acl.191
Volume: Findings of the Association for Computational Linguistics: ACL 2026
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 3927–3943
URL: https://aclanthology.org/2026.findings-acl.191/
Cite (ACL): Xinjie Xu, Yongqi Fan, Shuang-shuang Chen, Qi Ye, Weibin Guo, and Xinxuan Hu. 2026. Travel on the ICD Tree: Benchmarking Agentic Reasoning for ICD Coding from Chinese Electronic Medical Records. In Findings of the Association for Computational Linguistics: ACL 2026, pages 3927–3943, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): Travel on the ICD Tree: Benchmarking Agentic Reasoning for ICD Coding from Chinese Electronic Medical Records (Xu et al., Findings 2026)
PDF: https://aclanthology.org/2026.findings-acl.191.pdf
Checklist: 2026.findings-acl.191.checklist.pdf
