Yatin Nandwani
2024
Few shot chain-of-thought driven reasoning to prompt LLMs for open-ended medical question answering
Saeel Sandeep Nachane | Ojas Gramopadhye | Prateek Chanda | Ganesh Ramakrishnan | Kshitij Sharad Jadhav | Yatin Nandwani | Dinesh Raghu | Sachindra Joshi
Findings of the Association for Computational Linguistics: EMNLP 2024
In this paper, we propose a modified version of the MedQA-USMLE dataset, named MEDQA-OPEN, which contains open-ended medical questions without options to mimic clinical scenarios, along with clinician-approved reasoned answers. Additionally, we implement a Chain-of-Thought (CoT) driven prompt, CLINICR, that mirrors the prospective process of incremental reasoning leading to a correct response to medical questions. We empirically demonstrate that CLINICR outperforms the state-of-the-art 5-shot CoT-based prompt (Liévin et al., 2022). We also present an approach that mirrors real-life clinical practice by first exploring multiple differential diagnoses through MCQ-CLINICR and subsequently narrowing down to a final diagnosis using MCQ-ELIMINATIVE. Finally, emphasizing the importance of response verification in medical settings, we utilize a reward model mechanism, replacing the elimination process performed by MCQ-ELIMINATIVE.
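As a rough illustration of the two-stage flow the abstract describes, the sketch below first elicits differential diagnoses with a CoT-style prompt and then asks the model to eliminate candidates before committing to a final diagnosis. The prompt wording, the `query_llm` wrapper, and the `diagnose` helper are hypothetical, not the paper's actual MCQ-CLINICR / MCQ-ELIMINATIVE prompts.

```python
# Minimal sketch of an explore-then-eliminate prompting pipeline, under the
# assumptions stated above. `query_llm` stands in for any chat-completion API.

def query_llm(prompt: str) -> str:
    """Hypothetical wrapper around an LLM completion endpoint."""
    raise NotImplementedError

def diagnose(question: str) -> str:
    # Stage 1: open exploration of differentials, reasoning step by step
    # (loosely in the spirit of MCQ-CLINICR).
    differentials = query_llm(
        f"{question}\n"
        "Think step by step and list the plausible differential diagnoses."
    )
    # Stage 2: eliminative narrowing over the stage-1 candidates
    # (loosely in the spirit of MCQ-ELIMINATIVE).
    final = query_llm(
        f"{question}\nCandidate diagnoses:\n{differentials}\n"
        "Eliminate the candidates one by one, explaining why each is ruled "
        "out, and state the single most likely final diagnosis."
    )
    return final
```

The abstract's final step would replace the stage-2 elimination prompt with a reward model that scores and verifies candidate responses.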
2023
Pointwise Mutual Information Based Metric and Decoding Strategy for Faithful Generation in Document Grounded Dialogs
Yatin Nandwani | Vineet Kumar | Dinesh Raghu | Sachindra Joshi | Luis Lastras
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
A major concern in using deep learning based generative models for document-grounded dialogs is the potential generation of responses that are not faithful to the underlying document. Existing automated metrics for evaluating the faithfulness of a response with respect to the grounding document measure the degree of similarity between the generated response and the document's content. However, these automated metrics are far from well aligned with human judgments. Therefore, to improve the measurement of faithfulness, we propose a new metric that utilizes (Conditional) Pointwise Mutual Information (PMI) between the generated response and the source document, conditioned on the dialogue. PMI quantifies the extent to which the document influences the generated response: a higher PMI indicates a more faithful response. We build upon this idea to create a new decoding technique that incorporates PMI into the response generation process to predict more faithful responses. Our experiments on the BEGIN benchmark demonstrate an improved correlation of our metric with human evaluation. We also show that our decoding technique is effective in generating more faithful responses when compared to standard decoding techniques on a set of publicly available document-grounded dialog datasets.
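The metric reduces to a difference of two conditional log-likelihoods: PMI(response; document | dialogue) = log P(response | document, dialogue) − log P(response | dialogue). Below is a minimal sketch of how such a score could be computed with a generic causal LM via Hugging Face Transformers; the model choice (`gpt2`), the plain-text concatenation scheme, and the `log_prob` helper are illustrative assumptions, not the paper's implementation.

```python
# Sketch: conditional PMI as a faithfulness score, assuming a causal LM that
# can score a continuation given a prefix.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; the paper's scoring model may differ
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

def log_prob(prefix: str, continuation: str) -> float:
    """Sum of token log-probabilities of `continuation` given `prefix`.
    Assumes tokenizing prefix + continuation splits roughly at the prefix
    boundary; exact alignment would need offset mappings."""
    prefix_ids = tokenizer(prefix, return_tensors="pt").input_ids
    full_ids = tokenizer(prefix + continuation, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids).logits
    # Log-probs at each position for predicting the *next* token.
    log_probs = torch.log_softmax(logits[:, :-1], dim=-1)
    targets = full_ids[:, 1:]
    token_lp = log_probs.gather(2, targets.unsqueeze(-1)).squeeze(-1)
    # Keep only the positions that predict continuation tokens.
    start = prefix_ids.shape[1] - 1
    return token_lp[:, start:].sum().item()

def conditional_pmi(response: str, document: str, dialogue: str) -> float:
    grounded = log_prob(dialogue + "\n" + document + "\n", response)
    ungrounded = log_prob(dialogue + "\n", response)
    # Higher value => the document influenced the response more.
    return grounded - ungrounded
```

The PMI-aware decoding described in the abstract would fold a score of this form into generation itself, rather than applying it only post hoc to a finished response.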
Co-authors
- Dinesh Raghu 2
- Sachindra Joshi 2
- Vineet Kumar 1
- Luis Lastras 1
- Saeel Sandeep Nachane 1