Siddharth Rawal

2024

Automatic sentence segmentation of clinical record narratives in real-world data
Dongfang Xu | Davy Weissenbacher | Karen O’Connor | Siddharth Rawal | Graciela Gonzalez Hernandez
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

Sentence segmentation is a linguistic task and is widely used as a pre-processing step in many NLP applications. The need for sentence segmentation is particularly pronounced in clinical notes, where ungrammatical and fragmented texts are common. We propose a straightforward and effective sequence labeling classifier to predict sentence spans using a dynamic sliding window based on the prediction of each input sequence. This sliding window algorithm allows our approach to segment long text sequences on the fly. To evaluate our approach, we annotated 90 clinical notes from the MIMIC-III dataset. Additionally, we tested our approach on five other datasets to assess its generalizability and compared its performance against state-of-the-art systems on these datasets. Our approach outperformed all the systems, achieving an F1 score that is 15% higher than the next best-performing system on the clinical dataset.

2019

Identification of Adverse Drug Reaction Mentions in Tweets – SMM4H Shared Task 2019
Samarth Rawal | Siddharth Rawal | Saadat Anwar | Chitta Baral
Proceedings of the Fourth Social Media Mining for Health Applications (#SMM4H) Workshop & Shared Task

Analyzing social media posts can offer insights into a wide range of topics that are commonly discussed online, providing valuable information for studying various health-related phenomena reported online. The outcome of this work can offer insights into pharmacovigilance research to monitor the adverse effects of medications. This research specifically looks into mentions of adverse drug reactions (ADRs) in Twitter data through the Social Media Mining for Health Applications (SMM4H) Shared Task 2019. Adverse drug reactions are undesired harmful effects which can arise from medication or other methods of treatment. The goal of this research is to build accurate models using natural language processing techniques to detect reports of adverse drug reactions in Twitter data and extract these words or phrases.

Co-authors

Davy Weissenbacher 1

Dongfang Xu 1

Venues

ACL 1
EMNLP 1
