ICDBigBird: A Contextual Embedding Model for ICD Code Classification

George Michalopoulos, Michal Malyska, Nicola Sahar, Alexander Wong, Helen Chen


Abstract
The International Classification of Diseases (ICD) system is the international standard for classifying diseases and procedures during a healthcare encounter and is widely used for healthcare reporting and management. Assigning correct codes for clinical procedures is important for clinical, operational and financial decision-making in healthcare. Contextual word embedding models have achieved state-of-the-art results in multiple NLP tasks. However, these models have yet to achieve state-of-the-art results on the ICD classification task, since one of their main disadvantages is that they can only process documents that contain a small number of tokens, which is rarely the case with real patient notes. In this paper, we introduce ICDBigBird, a BigBird-based model that integrates a Graph Convolutional Network (GCN), which takes advantage of the relations between ICD codes to create ‘enriched’ representations of their embeddings, with a BigBird contextual model that can process longer documents. Our experiments on a real-world clinical dataset demonstrate the effectiveness of our BigBird-based model on the ICD classification task, as it outperforms the previous state-of-the-art models.
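The idea of enriching ICD code embeddings through a graph of code relations and then scoring them against a document representation can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the GCN variant, the graph, or the scoring function, so the sketch assumes one graph-convolution layer with the common symmetric normalization, a made-up four-code relation graph, random embeddings, a random vector standing in for the BigBird document encoding, and a sigmoid over dot products for multi-label scoring. All function names are hypothetical.

```python
import numpy as np


def normalize_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2, a symmetric normalization often used in GCNs."""
    a_hat = adj + np.eye(adj.shape[0])      # add self-loops
    deg = a_hat.sum(axis=1)                 # degree of each node (>= 1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(adj_norm, label_emb, weight):
    """One graph-convolution step: mix each code's embedding with its neighbours', then ReLU."""
    return np.maximum(adj_norm @ label_emb @ weight, 0.0)


def score_codes(doc_vec, enriched_emb):
    """Assumed scoring rule: sigmoid of the dot product between document and code vectors."""
    logits = enriched_emb @ doc_vec
    return 1.0 / (1.0 + np.exp(-logits))


rng = np.random.default_rng(0)
num_codes, dim = 4, 8

# Hypothetical toy relation graph between 4 codes (symmetric); code 3 is isolated.
adj = np.array(
    [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 0],
     [0, 0, 0, 0]],
    dtype=float,
)
label_emb = rng.normal(size=(num_codes, dim))   # initial code embeddings (random here)
weight = rng.normal(size=(dim, dim))            # layer weights (random, untrained)
doc_vec = rng.normal(size=dim)                  # placeholder for a long-document encoder output

adj_norm = normalize_adjacency(adj)
enriched = gcn_layer(adj_norm, label_emb, weight)
probs = score_codes(doc_vec, enriched)          # one probability per code
```

In a real system the document vector would come from a long-sequence encoder such as BigBird and the weights would be trained; here they are random, so the resulting probabilities carry no meaning beyond showing the data flow.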
Anthology ID:
2022.bionlp-1.32
Volume:
Proceedings of the 21st Workshop on Biomedical Language Processing
Month:
May
Year:
2022
Address:
Dublin, Ireland
Editors:
Dina Demner-Fushman, Kevin Bretonnel Cohen, Sophia Ananiadou, Junichi Tsujii
Venue:
BioNLP
Publisher:
Association for Computational Linguistics
Pages:
330–336
URL:
https://aclanthology.org/2022.bionlp-1.32
DOI:
10.18653/v1/2022.bionlp-1.32
Cite (ACL):
George Michalopoulos, Michal Malyska, Nicola Sahar, Alexander Wong, and Helen Chen. 2022. ICDBigBird: A Contextual Embedding Model for ICD Code Classification. In Proceedings of the 21st Workshop on Biomedical Language Processing, pages 330–336, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
ICDBigBird: A Contextual Embedding Model for ICD Code Classification (Michalopoulos et al., BioNLP 2022)
PDF:
https://aclanthology.org/2022.bionlp-1.32.pdf
Video:
https://aclanthology.org/2022.bionlp-1.32.mp4