Mohit Gupta


2025

Med-CoDE: Medical Critique based Disagreement Evaluation Framework
Mohit Gupta | Akiko Aizawa | Rajiv Ratn Shah
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 4: Student Research Workshop)

The emergence of large language models (LLMs) has significantly influenced numerous fields, including healthcare, by enhancing the capabilities of automated systems to process and generate human-like text. However, despite their advancements, the reliability and accuracy of LLMs in medical contexts remain critical concerns. Current evaluation methods often lack robustness and fail to provide a comprehensive assessment of LLM performance, leading to potential risks in clinical settings. In this work, we propose Med-CoDE, an evaluation framework designed specifically for medical LLMs to address these challenges. The framework leverages a critique-based approach to quantitatively measure the degree of disagreement between model-generated responses and established medical ground truths. It captures both accuracy and reliability in medical settings and aims to fill the existing gap in LLM assessment by offering a systematic method to evaluate the quality and trustworthiness of medical LLMs. Through extensive experiments and case studies, we illustrate the practicality of our framework in providing a comprehensive and reliable evaluation of medical LLMs.

2023

Image Manipulation via Multi-Hop Instructions - A New Dataset and Weakly-Supervised Neuro-Symbolic Approach
Harman Singh | Poorva Garg | Mohit Gupta | Kevin Shah | Ashish Goswami | Satyam Modi | Arnab Mondal | Dinesh Khandelwal | Dinesh Garg | Parag Singla
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

We are interested in image manipulation via natural language text, a task that is useful for multiple AI applications but requires complex reasoning over multi-modal spaces. We extend the recently proposed Neuro Symbolic Concept Learning (NSCL), which has been quite effective for the task of Visual Question Answering (VQA), to the task of image manipulation. Our system, referred to as NeuroSIM, can perform complex multi-hop reasoning over multi-object scenes and only requires weak supervision in the form of annotated data for VQA. NeuroSIM parses an instruction into a symbolic program, based on a Domain Specific Language (DSL) comprising object attributes and manipulation operations, that guides its execution. We create a new dataset for the task, and extensive experiments demonstrate that NeuroSIM is highly competitive with or beats SOTA baselines that make use of supervised data for manipulation.

2015

Adjective Intensity and Sentiment Analysis
Raksha Sharma | Mohit Gupta | Astha Agarwal | Pushpak Bhattacharyya
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

Shallow Discourse Parsing with Syntactic and (a Few) Semantic Features
Shubham Mukherjee | Abhishek Tiwari | Mohit Gupta | Anil Kumar Singh
Proceedings of the Nineteenth Conference on Computational Natural Language Learning - Shared Task