Accurate and Well-Calibrated ICD Code Assignment Through Attention Over Diverse Label Embeddings

Goncalo Gomes, Isabel Coutinho, Bruno Martins


Abstract
Although the International Classification of Diseases (ICD) has been adopted worldwide, manually assigning ICD codes to clinical text is time-consuming, error-prone, and expensive, motivating the development of automated approaches. This paper describes a novel approach for automated ICD coding, combining several ideas from previous related work. We specifically employ a strong Transformer-based model as a text encoder and, to handle lengthy clinical narratives, we explore either (a) adapting the base encoder model into a Longformer, or (b) dividing the text into chunks and processing each chunk independently. The representations produced by the encoder are combined with a label embedding mechanism that explores diverse ICD code synonyms. Experiments with different splits of the MIMIC-III dataset show that the proposed approach outperforms the current state-of-the-art models in ICD coding, with the label embeddings contributing significantly to this performance. Our approach also leads to properly calibrated classification results, which can effectively inform downstream tasks such as quantification.
Anthology ID:
2024.eacl-long.141
Volume:
Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
March
Year:
2024
Address:
St. Julian’s, Malta
Editors:
Yvette Graham, Matthew Purver
Venue:
EACL
Publisher:
Association for Computational Linguistics
Pages:
2302–2315
URL:
https://aclanthology.org/2024.eacl-long.141
Cite (ACL):
Goncalo Gomes, Isabel Coutinho, and Bruno Martins. 2024. Accurate and Well-Calibrated ICD Code Assignment Through Attention Over Diverse Label Embeddings. In Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers), pages 2302–2315, St. Julian’s, Malta. Association for Computational Linguistics.
Cite (Informal):
Accurate and Well-Calibrated ICD Code Assignment Through Attention Over Diverse Label Embeddings (Gomes et al., EACL 2024)
PDF:
https://aclanthology.org/2024.eacl-long.141.pdf