Silvia Laura Bosello


2024

Lupus Alberto: A Transformer-Based Approach for SLE Information Extraction from Italian Clinical Reports
Livia Lilli | Laura Antenucci | Augusta Ortolan | Silvia Laura Bosello | Maria Antonietta D’Agostino | Stefano Patarnello | Carlotta Masciocchi | Jacopo Lenkowicz
Proceedings of the 10th Italian Conference on Computational Linguistics (CLiC-it 2024)

Natural Language Processing (NLP) is widely used across several fields, particularly in medicine, where information often originates from unstructured data sources. This creates a need for automated systems that classify text and extract information from Electronic Health Records (EHRs). However, a significant challenge lies in the limited availability of pre-trained models for less common languages, such as Italian, and for specific medical domains. Our study aims to develop an NLP approach to extract Systemic Lupus Erythematosus (SLE) information from Italian EHRs at Gemelli Hospital in Rome. We introduce Lupus Alberto, a fine-tuned version of AlBERTo, trained to classify categories derived from three distinct domains: Diagnosis, Therapy, and Symptom. We evaluated Lupus Alberto’s performance by comparing it with baseline approaches, selecting from the available BERT-based models for the Italian language and fine-tuning them for the same tasks. Evaluation results show that Lupus Alberto achieves overall F-scores of 79%, 87%, and 76% for the Diagnosis, Therapy, and Symptom domains, respectively. Furthermore, our approach outperformed the baseline models in the Diagnosis and Symptom domains, demonstrating superior performance in identifying and categorizing relevant SLE information, thereby improving clinical decision-making and patient management.