Saipriya Dipika Vaidyanathan


2025

This paper presents a fully automated pipeline for normalizing adverse drug event (ADE) mentions in user-generated medical texts to MedDRA concepts. The core approach is a hybrid retrieval architecture that combines domain-specific phrase normalization, synonym augmentation, and explicit mappings for key symptoms, thereby improving coverage of lexical variants. For candidate generation, the system blends exact dictionary lookups with fuzzy matching, supplemented by drug-specific contextual scoring. A sentence-transformer model (distilroberta-v1) was fine-tuned on augmented phrases, and reciprocal rank fusion unifies the multiple retrieval signals.
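As an illustration of how reciprocal rank fusion can merge several retrieval signals, the following is a minimal sketch and not the pipeline's actual implementation. The function name, the smoothing constant k = 60 (a commonly used default), and the toy candidate lists are all assumptions made for this example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked candidate lists into one list, best first.

    Each candidate receives 1 / (k + rank) from every list in which it
    appears (rank starts at 1); the contributions are summed.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, candidate in enumerate(ranking, start=1):
            scores[candidate] += 1.0 / (k + rank)
    # Sort by descending fused score; break ties alphabetically so the
    # output is deterministic.
    return sorted(scores, key=lambda c: (-scores[c], c))


# Hypothetical candidate lists from three retrievers (illustrative only).
exact_hits = ["Nausea", "Vomiting"]
fuzzy_hits = ["Nausea", "Dizziness", "Vomiting"]
embedding_hits = ["Dizziness", "Nausea", "Headache"]

fused = reciprocal_rank_fusion([exact_hits, fuzzy_hits, embedding_hits])
print(fused)
```

Because the fusion uses only ranks, the retrievers' raw scores (string similarity, cosine similarity, and so on) do not need to be calibrated against one another.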
Transformer-based multilingual question-answering models are used to detect causality in financial text data. This study employs BERT (CITATION) for English text and XLM-RoBERTa (CITATION) for Spanish text, both fine-tuned on the SQuAD datasets (CITATION) (CITATION). We design a system that uses these pre-trained models to extract answers to targeted questions from the given context. The results validate the effectiveness of the system in understanding nuanced financial language and offer a tool for multilingual text analysis. Our system achieves SAS scores of 0.75 in Spanish and 0.82 in English.