Low Resource ICD Coding of Hospital Discharge Summaries

Ashton Williamson, David de Hilster, Amnon Meyers, Nina Hubig, Amy Apon


Abstract
Medical coding is the process by which standardized medical codes are assigned to patient health records. This is a complex and challenging task that typically requires an expert human coder to review health records and assign codes from a classification system based on a standard set of rules. Since health records typically consist of a large proportion of free-text documents, this problem has traditionally been approached as a natural language processing (NLP) task. While machine learning-based methods have recently become popular for this task, they tend to struggle with codes that are assigned less frequently, for which little or no training data exists. In this work, we use the open-source NLP programming language NLP++ to design and build an automated system that assigns International Classification of Diseases (ICD) codes to discharge summaries and functions in the absence of labeled training data. We evaluate our system on the MIMIC-III dataset and find that, for codes with little training data, our approach achieves performance competitive with state-of-the-art machine learning approaches.
Anthology ID:
2024.bionlp-1.45
Volume:
Proceedings of the 23rd Workshop on Biomedical Natural Language Processing
Month:
August
Year:
2024
Address:
Bangkok, Thailand
Editors:
Dina Demner-Fushman, Sophia Ananiadou, Makoto Miwa, Kirk Roberts, Junichi Tsujii
Venues:
BioNLP | WS
SIG:
SIGBIOMED
Publisher:
Association for Computational Linguistics
Pages:
548–558
URL:
https://aclanthology.org/2024.bionlp-1.45
DOI:
10.18653/v1/2024.bionlp-1.45
Cite (ACL):
Ashton Williamson, David de Hilster, Amnon Meyers, Nina Hubig, and Amy Apon. 2024. Low Resource ICD Coding of Hospital Discharge Summaries. In Proceedings of the 23rd Workshop on Biomedical Natural Language Processing, pages 548–558, Bangkok, Thailand. Association for Computational Linguistics.
Cite (Informal):
Low Resource ICD Coding of Hospital Discharge Summaries (Williamson et al., BioNLP-WS 2024)
PDF:
https://aclanthology.org/2024.bionlp-1.45.pdf