Clinical Text Classification to SNOMED CT Codes Using Transformers Trained on Linked Open Medical Ontologies

Anton Hristov, Petar Ivanov, Anna Aksenova, Tsvetan Asamov, Pavlin Gyurov, Todor Primov, Svetla Boytcheva


Abstract
We present an approach for medical text coding with SNOMED CT. Our approach uses publicly available linked open data from terminologies and ontologies as training data for the algorithms. We claim that even small training corpora made of short text snippets can be used to train models for the given task. We propose a method based on transformers enhanced with clustering and filtering of the candidates. Further, we adopt a classical machine learning approach, support vector classification (SVC), using transformer embeddings. The resulting approach proves to be more accurate than predictions from large language models. We evaluate on a dataset generated from linked open data for SNOMED codes related to morphology and topography for four use cases. Our transformers-based approach achieves an F1-score of 0.82 for morphology and 0.99 for topography codes. Further, we validate the applicability of our approach in a clinical context using labelled real clinical data that are not used for model training.
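For readers less familiar with the metric reported in the abstract, the F1-score is the harmonic mean of precision and recall. The following is a minimal sketch in Python, not the authors' evaluation code: the function name `f1_for_label`, the placeholder labels, and the per-label (one-vs-rest) treatment are illustrative assumptions, and the abstract does not state which averaging scheme was used across codes.

```python
def f1_for_label(gold, predicted, label):
    """F1 for a single label, treating it as the positive class (one-vs-rest)."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == label and p == label)  # true positives
    fp = sum(1 for g, p in pairs if g != label and p == label)  # false positives
    fn = sum(1 for g, p in pairs if g == label and p != label)  # false negatives
    if tp == 0:
        # No correct positive predictions: F1 is conventionally taken as 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Hypothetical placeholder labels, not real SNOMED CT codes.
gold = ["code_A", "code_A", "code_B", "code_B", "code_A"]
predicted = ["code_A", "code_B", "code_B", "code_B", "code_A"]

score_a = f1_for_label(gold, predicted, "code_A")
score_b = f1_for_label(gold, predicted, "code_B")
```

In this toy example one `code_A` snippet is mislabelled as `code_B`, which lowers recall for `code_A` and precision for `code_B`. A single figure such as those quoted in the abstract would come from aggregating per-label scores, for example by macro- or micro-averaging.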
Anthology ID:
2023.ranlp-1.57
Volume:
Proceedings of the 14th International Conference on Recent Advances in Natural Language Processing
Month:
September
Year:
2023
Address:
Varna, Bulgaria
Editors:
Ruslan Mitkov, Galia Angelova
Venue:
RANLP
Publisher:
INCOMA Ltd., Shoumen, Bulgaria
Pages:
519–526
URL:
https://aclanthology.org/2023.ranlp-1.57
Cite (ACL):
Anton Hristov, Petar Ivanov, Anna Aksenova, Tsvetan Asamov, Pavlin Gyurov, Todor Primov, and Svetla Boytcheva. 2023. Clinical Text Classification to SNOMED CT Codes Using Transformers Trained on Linked Open Medical Ontologies. In Proceedings of the 14th International Conference on Recent Advances in Natural Language Processing, pages 519–526, Varna, Bulgaria. INCOMA Ltd., Shoumen, Bulgaria.
Cite (Informal):
Clinical Text Classification to SNOMED CT Codes Using Transformers Trained on Linked Open Medical Ontologies (Hristov et al., RANLP 2023)
PDF:
https://aclanthology.org/2023.ranlp-1.57.pdf