Automatic Transformation of Clinical Narratives into Structured Format

Sylvia Vassileva, Gergana Todorova, Kristina Ivanova, Boris Velichkov, Ivan Koychev, Galia Angelova, Svetla Boytcheva


Abstract
Vast amounts of healthcare data are available as unstructured text, usually written in the local language of each country. These documents contain valuable information. Secondary use of clinical narratives, and the extraction of key facts and relations about a patient's disease history from them, can foster preventive medicine and improve healthcare. In this paper, we propose a hybrid method for automatically transforming clinical text into a structured format. Documents are automatically split into the following sections: diagnosis, patient history, patient status, and lab results. For the “Diagnosis” section, the text is encoded into ICD-10 codes with a deep learning approach using MBG-ClinicalBERT, a ClinicalBERT model fine-tuned for Bulgarian medical text. From the “Patient History” section, we identify patient symptoms using a rule-based approach enhanced with similarity search based on MBG-ClinicalBERT word embeddings. We also identify relations involving symptoms, such as negation. For the “Patient Status” description, binary classification is used to determine the status of each anatomic organ. We thereby demonstrate different methods for adapting NLP tools developed for English and other languages to a low-resource language such as Bulgarian.
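The sectioning and symptom-identification steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the section headings, the English toy lexicons, the `window` parameter, and the function names `split_sections` and `find_symptoms` are all hypothetical, and exact lexicon matching stands in for the similarity search over MBG-ClinicalBERT word embeddings, which is not reproduced here.

```python
import re

# Hypothetical section headings; the actual (Bulgarian) cues are not given in the abstract.
SECTION_HEADINGS = {
    "diagnosis": "Diagnosis",
    "patient_history": "Patient History",
    "patient_status": "Patient Status",
    "lab_results": "Lab Results",
}

# Hypothetical toy lexicons, for illustration only.
SYMPTOMS = ["fever", "cough", "headache"]
NEGATION_CUES = ["no", "without", "denies"]


def split_sections(text):
    """Split a narrative into named sections using 'Heading:' markers."""
    pattern = "|".join(re.escape(h) for h in SECTION_HEADINGS.values())
    # A capturing group makes re.split keep the matched headings in the result.
    parts = re.split(rf"({pattern}):", text)
    name_by_heading = {h: k for k, h in SECTION_HEADINGS.items()}
    sections = {}
    # parts has the shape [preamble, heading1, body1, heading2, body2, ...]
    for heading, body in zip(parts[1::2], parts[2::2]):
        sections[name_by_heading[heading]] = body.strip()
    return sections


def find_symptoms(history, window=3):
    """Return (symptom, negated) pairs.

    A symptom counts as negated when a negation cue occurs within
    `window` tokens before it (a simple rule, assumed for this sketch).
    """
    tokens = re.findall(r"\w+", history.lower())
    found = []
    for i, tok in enumerate(tokens):
        if tok in SYMPTOMS:
            preceding = tokens[max(0, i - window):i]
            negated = any(cue in preceding for cue in NEGATION_CUES)
            found.append((tok, negated))
    return found


note = (
    "Diagnosis: Essential hypertension. "
    "Patient History: Reports headache for two days, no fever. "
    "Patient Status: Lungs clear. "
    "Lab Results: Glucose 5.4 mmol/L."
)
sections = split_sections(note)
symptoms = find_symptoms(sections["patient_history"])
```

In this toy note, "fever" is marked as negated because "no" appears directly before it, while "headache" is not. A fixed token window is a crude rule; it is used here only to show the shape of a rule-based negation check.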
Anthology ID:
2021.ranlp-srw.30
Volume:
Proceedings of the Student Research Workshop Associated with RANLP 2021
Month:
September
Year:
2021
Address:
Online
Editors:
Souhila Djabri, Dinara Gimadi, Tsvetomila Mihaylova, Ivelina Nikolova-Koleva
Venue:
RANLP
Publisher:
INCOMA Ltd.
Pages:
219–227
URL:
https://aclanthology.org/2021.ranlp-srw.30
Cite (ACL):
Sylvia Vassileva, Gergana Todorova, Kristina Ivanova, Boris Velichkov, Ivan Koychev, Galia Angelova, and Svetla Boytcheva. 2021. Automatic Transformation of Clinical Narratives into Structured Format. In Proceedings of the Student Research Workshop Associated with RANLP 2021, pages 219–227, Online. INCOMA Ltd.
Cite (Informal):
Automatic Transformation of Clinical Narratives into Structured Format (Vassileva et al., RANLP 2021)
PDF:
https://aclanthology.org/2021.ranlp-srw.30.pdf