Identification of Risk Factors in Clinical Texts through Association Rules

Svetla Boytcheva, Ivelina Nikolova, Galia Angelova, Zhivko Angelov


Abstract
We describe a method which extracts Association Rules from texts in order to recognise verbalisations of risk factors. Usually some basic vocabulary about risk factors is known, but medical conditions are expressed in clinical narratives with much higher variety. We propose an approach for data-driven learning of specialised medical vocabulary which, once collected, enables early alerting of potentially affected patients. The method is illustrated by experiments with clinical records of patients with Chronic Obstructive Pulmonary Disease (COPD) and comorbidity of CORD, Diabetes Mellitus and Schizophrenia. Our input data come from the Bulgarian Diabetic Register, which is built using a pseudonymised collection of outpatient records for about 500,000 diabetic patients. The generated Association Rules for CORD are analysed in the context of demographic, gender, and age information. Valuable amounts of meaningful words signalling risk factors are discovered with high precision and confidence.
Anthology ID:
W17-8009
Volume:
Proceedings of the Biomedical NLP Workshop associated with RANLP 2017
Month:
September
Year:
2017
Address:
Varna, Bulgaria
Editors:
Svetla Boytcheva, Kevin Bretonnel Cohen, Guergana Savova, Galia Angelova
Venue:
RANLP
Publisher:
INCOMA Ltd.
Pages:
64–72
URL:
https://doi.org/10.26615/978-954-452-044-1_009
DOI:
10.26615/978-954-452-044-1_009
Cite (ACL):
Svetla Boytcheva, Ivelina Nikolova, Galia Angelova, and Zhivko Angelov. 2017. Identification of Risk Factors in Clinical Texts through Association Rules. In Proceedings of the Biomedical NLP Workshop associated with RANLP 2017, pages 64–72, Varna, Bulgaria. INCOMA Ltd.
Cite (Informal):
Identification of Risk Factors in Clinical Texts through Association Rules (Boytcheva et al., RANLP 2017)
PDF:
https://doi.org/10.26615/978-954-452-044-1_009