Arkadeep Acharya
2024
HealthAlignSumm : Utilizing Alignment for Multimodal Summarization of Code-Mixed Healthcare Dialogues
Akash Ghosh
|
Arkadeep Acharya
|
Sriparna Saha
|
Gaurav Pandey
|
Dinesh Raghu
|
Setu Sinha
Findings of the Association for Computational Linguistics: EMNLP 2024
As generative AI progresses, collaboration be-tween doctors and AI scientists is leading to thedevelopment of personalized models to stream-line healthcare tasks and improve productivity.Summarizing doctor-patient dialogues has be-come important, helping doctors understandconversations faster and improving patient care.While previous research has mostly focused ontext data, incorporating visual cues from pa-tient interactions allows doctors to gain deeperinsights into medical conditions. Most of thisresearch has centered on English datasets, butreal-world conversations often mix languagesfor better communication. To address the lackof resources for multimodal summarization ofcode-mixed dialogues in healthcare, we devel-oped the MCDH dataset. Additionally, we cre-ated HealthAlignSumm, a new model that in-tegrates visual components with the BART ar-chitecture. This represents a key advancementin multimodal fusion, applied within both theencoder and decoder of the BART model. Ourwork is the first to use alignment techniques,including state-of-the-art algorithms like DirectPreference Optimization, on encoder-decodermodels with synthetic datasets for multimodalsummarization. Through extensive experi-ments, we demonstrated the superior perfor-mance of HealthAlignSumm across severalmetrics validated by both automated assess-ments and human evaluations. The datasetMCDH and our proposed model HealthAlign-Summ will be available in this GitHub accounthttps://github.com/AkashGhosh/HealthAlignSumm-Utilizing-Alignment-for-Multimodal-Summarization-of-Code-Mixed-Healthcare-DialoguesDisclaimer: This work involves medical im-agery based on the subject matter of the topic.
2023
Do Language Models Have a Common Sense regarding Time? Revisiting Temporal Commonsense Reasoning in the Era of Large Language Models
Raghav Jain
|
Daivik Sojitra
|
Arkadeep Acharya
|
Sriparna Saha
|
Adam Jatowt
|
Sandipan Dandapat
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
Temporal reasoning represents a vital component of human communication and understanding, yet remains an underexplored area within the context of Large Language Models (LLMs). Despite LLMs demonstrating significant proficiency in a range of tasks, a comprehensive, large-scale analysis of their temporal reasoning capabilities is missing. Our paper addresses this gap, presenting the first extensive benchmarking of LLMs on temporal reasoning tasks. We critically evaluate 8 different LLMs across 6 datasets using 3 distinct prompting strategies. Additionally, we broaden the scope of our evaluation by including in our analysis 2 Code Generation LMs. Beyond broad benchmarking of models and prompts, we also conduct a fine-grained investigation of performance across different categories of temporal tasks. We further analyze the LLMs on varying temporal aspects, offering insights into their proficiency in understanding and predicting the continuity, sequence, and progression of events over time. Our findings reveal a nuanced depiction of the capabilities and limitations of the models within temporal reasoning, offering a comprehensive reference for future research in this pivotal domain.
Search
Fix data
Co-authors
- Sriparna Saha 2
- Sandipan Dandapat 1
- Akash Ghosh 1
- Raghav Jain 1
- Adam Jatowt 1
- show all...