Multi-stage Retrieve and Re-rank Model for Automatic Medical Coding Recommendation

Xindi Wang, Robert Mercer, Frank Rudzicz


Abstract
The International Classification of Diseases (ICD) serves as a definitive medical classification system encompassing a wide range of diseases and conditions. The primary objective of ICD indexing is to allocate a subset of ICD codes to a medical record, which facilitates standardized documentation and management of various health conditions. Most existing approaches struggle to select the proper label subsets from an extremely large ICD collection with a heavily long-tailed label distribution. In this paper, we leverage a multi-stage “retrieve and re-rank” framework as a novel solution to ICD indexing: a hybrid discrete retrieval method first collects candidate labels, and the retrieved candidates are then re-ranked with contrastive learning, which allows the model to make more accurate predictions from a simplified label space. The retrieval model is a hybrid of auxiliary knowledge of the electronic health records (EHR) and a discrete retrieval method (BM25), which efficiently collects high-quality candidates. In the last stage, we propose a label co-occurrence guided contrastive re-ranking model, which re-ranks the candidate labels by pulling together the clinical notes with positive ICD codes. Experimental results show the proposed method achieves state-of-the-art performance on a number of measures on the MIMIC-III benchmark.
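The two-stage idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the label identifiers (`CODE_A` and so on), the toy descriptions, the function names, and the `overlap_score` re-ranker are all hypothetical. The first stage uses a plain Okapi BM25 scorer without the EHR auxiliary knowledge, and the second stage uses a simple token-overlap function as a stand-in for the learned contrastive re-ranking model.

```python
import math
from collections import Counter

# Hypothetical toy label space: placeholder identifiers, not real ICD codes.
LABELS = {
    "CODE_A": "essential hypertension",
    "CODE_B": "congestive heart failure",
    "CODE_C": "type 2 diabetes mellitus",
    "CODE_D": "acute kidney failure",
}
NOTE = "patient with history of hypertension and congestive heart failure"


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption made for brevity)."""
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score every document in `docs` against `query` with Okapi BM25."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(d) for d in tokenized) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in tokenized for term in set(d))
    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def retrieve(note, label_texts, top_k):
    """Stage 1: keep the top_k labels whose descriptions best match the note."""
    codes = list(label_texts)
    scores = bm25_scores(note, [label_texts[c] for c in codes])
    ranked = sorted(zip(codes, scores), key=lambda pair: pair[1], reverse=True)
    return [code for code, _ in ranked[:top_k]]


def overlap_score(note, code):
    """Placeholder scorer: share of label tokens that appear in the note.
    A learned note/label similarity model would take its place."""
    note_tokens = set(tokenize(note))
    label_tokens = tokenize(LABELS[code])
    return sum(t in note_tokens for t in label_tokens) / len(label_tokens)


def rerank(note, candidates, score_fn):
    """Stage 2: reorder the reduced candidate set with a finer-grained scorer."""
    return sorted(candidates, key=lambda c: score_fn(note, c), reverse=True)


candidates = retrieve(NOTE, LABELS, top_k=3)
final_ranking = rerank(NOTE, candidates, overlap_score)
```

The point of the design is that the second-stage scorer only sees the `top_k` candidates from retrieval rather than the full label collection, so a more expensive model can be afforded there.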
Anthology ID:
2024.naacl-long.273
Volume:
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
Month:
June
Year:
2024
Address:
Mexico City, Mexico
Editors:
Kevin Duh, Helena Gomez, Steven Bethard
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
4881–4891
URL:
https://aclanthology.org/2024.naacl-long.273
Cite (ACL):
Xindi Wang, Robert Mercer, and Frank Rudzicz. 2024. Multi-stage Retrieve and Re-rank Model for Automatic Medical Coding Recommendation. In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), pages 4881–4891, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal):
Multi-stage Retrieve and Re-rank Model for Automatic Medical Coding Recommendation (Wang et al., NAACL 2024)
PDF:
https://aclanthology.org/2024.naacl-long.273.pdf
Copyright:
 2024.naacl-long.273.copyright.pdf