Sudeshna Jana


2024

pdf bib
PollCardioKG: A Dynamic Knowledge Graph of Interaction Between Pollution and Cardiovascular Diseases
Sudeshna Jana | Anunak Roy | Manjira Sinha | Tirthankar Dasgupta
Proceedings of the 21st International Conference on Natural Language Processing (ICON)

In recent decades, environmental pollution has become a pressing global health concern. According to the World Health Organization (WHO), a significant portion of the population is exposed to air pollutant levels exceeding safety guidelines. Cardiovascular diseases (CVDs) — including coronary artery disease, heart attacks, and strokes — are particularly significant health effects of this exposure. In this paper, we investigate the effects of air pollution on cardiovascular health by constructing a dynamic knowledge graph based on extensive biomedical literature. This paper provides a comprehensive exploration of entity identification and relation extraction, leveraging advanced language models. Additionally, we demonstrate how in-context learning with large language models can enhance the accuracy and efficiency of the extraction process. The constructed knowledge graph enables us to analyze the relationships between pollutants and cardiovascular diseases over the years, providing deeper insights into the long-term impact of cumulative exposure, underlying causal mechanisms, vulnerable populations, and the role of emerging contaminants in worsening various cardiac outcomes.

pdf bib
FORCE: A Benchmark Dataset for Foodborne Disease Outbreak and Recall Event Extraction from News
Sudeshna Jana | Manjira Sinha | Tirthankar Dasgupta
Proceedings of The 9th Social Media Mining for Health Research and Applications (SMM4H 2024) Workshop and Shared Tasks

The escalating prevalence of food safety incidents within the food supply chain necessitates immediate action to protect consumers. These incidents encompass a spectrum of issues, including food product contamination and deliberate food and feed adulteration for economic gain leading to outbreaks and recalls. Understanding the origins and pathways of contamination is imperative for prevention and mitigation. In this paper, we introduce FORCE Foodborne disease Outbreak and ReCall Event extraction from openweb). Our proposed model leverages a multi-tasking sequence labeling architecture in conjunction with transformer-based document embeddings. We have compiled a substantial annotated corpus comprising relevant articles published between 2011 and 2023 to train and evaluate the model. The dataset will be publicly released with the paper. The event detection model demonstrates fair accuracy in identifying food-related incidents and outbreaks associated with organizations, as assessed through cross-validation techniques.

2022

pdf bib
ATL at FinCausal 2022: Transformer Based Architecture for Automatic Causal Sentence Detection and Cause-Effect Extraction
Abir Naskar | Tirthankar Dasgupta | Sudeshna Jana | Lipika Dey
Proceedings of the 4th Financial Narrative Processing Workshop @LREC2022

Automatic extraction of cause-effect relationships from natural language texts is a challenging open problem in Artificial Intelligence. Most of the early attempts at its solution used manually constructed linguistic and syntactic rules on restricted domain data sets. With the advent of big data, and the recent popularization of deep learning, the paradigm to tackle this problem has slowly shifted. In this work we proposed a transformer based architecture to automatically detect causal sentences from textual mentions and then identify the corresponding cause-effect relations. We describe our submission to the FinCausal 2022 shared task based on this method. Our model achieves a F1-score of 0.99 for the Task-1 and F1-score of 0.60 for Task-2 on the shared task data set on financial documents.