Simone Scaboro


2023

Boosting Adverse Drug Event Normalization on Social Media: General-Purpose Model Initialization and Biomedical Semantic Text Similarity Benefit Zero-Shot Linking in Informal Contexts
François Remy | Simone Scaboro | Beatrice Portelli
Proceedings of the 11th International Workshop on Natural Language Processing for Social Media

Improving Multi-lingual Medical Term Normalization to Address the Long-Tail Problem
Beatrice Portelli | Simone Scaboro | Giuseppe Serra
Proceedings of the 37th Pacific Asia Conference on Language, Information and Computation

2022

AILAB-Udine@SMM4H’22: Limits of Transformers and BERT Ensembles
Beatrice Portelli | Simone Scaboro | Emmanuele Chersoni | Enrico Santus | Giuseppe Serra
Proceedings of The Seventh Workshop on Social Media Mining for Health Applications, Workshop & Shared Task

This paper describes the models developed by the AILAB-Udine team for the SMM4H’22 Shared Task. We explored the limits of Transformer-based models on text classification, entity extraction and entity normalization, tackling Tasks 1, 2, 5, 6 and 10. The main takeaways from participating in different tasks are the overwhelmingly positive effects of combining different architectures when using ensemble learning, and the great potential of generative models for term normalization.

Generalizing over Long Tail Concepts for Medical Term Normalization
Beatrice Portelli | Simone Scaboro | Enrico Santus | Hooman Sedghamiz | Emmanuele Chersoni | Giuseppe Serra
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

Medical term normalization consists of mapping a piece of text to one of a large number of output classes. Given the small size of the annotated datasets and the extremely long-tailed distribution of the concepts, it is of utmost importance to develop models that are capable of generalizing to scarce or unseen concepts. An important attribute of most target ontologies is their hierarchical structure. In this paper we introduce a simple and effective learning strategy that leverages such information to enhance the generalizability of both discriminative and generative models. The evaluation shows that the proposed strategy produces state-of-the-art performance on seen concepts and consistent improvements on unseen ones, while also allowing efficient zero-shot knowledge transfer across text typologies and datasets.

2021

NADE: A Benchmark for Robust Adverse Drug Events Extraction in Face of Negations
Simone Scaboro | Beatrice Portelli | Emmanuele Chersoni | Enrico Santus | Giuseppe Serra
Proceedings of the Seventh Workshop on Noisy User-generated Text (W-NUT 2021)

Adverse Drug Event (ADE) extraction models can rapidly examine large collections of social media texts, detecting mentions of drug-related adverse reactions and triggering medical investigations. However, despite the recent advances in NLP, it is currently unknown whether such models are robust in the face of negation, which is pervasive across language varieties. In this paper we evaluate three state-of-the-art systems, showing their fragility against negation, and then introduce two possible strategies to increase the robustness of these models: a pipeline approach, relying on a specific component for negation detection, and an augmentation of an ADE extraction dataset to artificially create negated samples and further train the models. We show that both strategies bring significant increases in performance, lowering the number of spurious entities predicted by the models. Our dataset and code will be publicly released to encourage research on the topic.