MeDiSumQA: Patient-Oriented Question-Answer Generation from Discharge Letters

Amin Dada; Osman Koraş; Marie Bauer; Amanda Butler; Kaleb Smith; Jens Kleesiek; Julian Friedrich

doi:10.18653/v1/2025.cl4health-1.10

Abstract

While increasing patients’ access to medical documents improves medical care, this benefit is limited by varying health literacy levels and complex medical terminology. Large language models (LLMs) offer solutions by simplifying medical information. However, evaluating LLMs for safe and patient-friendly text generation is difficult due to the lack of standardized evaluation resources. To fill this gap, we developed MeDiSumQA. MeDiSumQA is a dataset created from MIMIC-IV discharge summaries through an automated pipeline combining LLM-based question-answer generation with manual quality checks. We use this dataset to evaluate various LLMs on patient-oriented question-answering. Our findings reveal that general-purpose LLMs frequently surpass biomedical-adapted models, while automated metrics correlate with human judgment. By releasing MeDiSumQA on PhysioNet, we aim to advance the development of LLMs to enhance patient understanding and ultimately improve care outcomes.

Anthology ID: 2025.cl4health-1.10
Volume: Proceedings of the Second Workshop on Patient-Oriented Language Processing (CL4Health)
Month: May
Year: 2025
Address: Albuquerque, New Mexico
Editors: Sophia Ananiadou, Dina Demner-Fushman, Deepak Gupta, Paul Thompson
Venues: CL4Health | WS
Publisher: Association for Computational Linguistics
Pages: 124–136
URL: https://aclanthology.org/2025.cl4health-1.10/
DOI: 10.18653/v1/2025.cl4health-1.10
Cite (ACL): Amin Dada, Osman Koras, Marie Bauer, Amanda Butler, Kaleb Smith, Jens Kleesiek, and Julian Friedrich. 2025. MeDiSumQA: Patient-Oriented Question-Answer Generation from Discharge Letters. In Proceedings of the Second Workshop on Patient-Oriented Language Processing (CL4Health), pages 124–136, Albuquerque, New Mexico. Association for Computational Linguistics.
Cite (Informal): MeDiSumQA: Patient-Oriented Question-Answer Generation from Discharge Letters (Dada et al., CL4Health 2025)
PDF: https://aclanthology.org/2025.cl4health-1.10.pdf
