Overview of the MedGenVidQA 2026 Shared Task on Medical Generative Video Question Answering

Deepak Gupta; Collin Campbell; Pedram Golnari; Dina Demner-Fushman

Abstract

This paper presents an overview of the MedGenVidQA 2026 shared task on medical video question answering, co-located with the 25th BioNLP workshop at ACL 2026. The shared task addressed three related sub-tasks of medical multimodal (textual and video) question answering: (i) multimodal retrieval, (ii) multimodal answer generation with citations, and (iii) visual answer localization. The key theme of the shared task is to develop reliable multimodal question answering systems for consumers and medical professionals by leveraging generative models. Nine teams participated in the shared task and made forty-three submissions across all tasks. We performed both automated and human assessments to evaluate the submissions. This paper describes the tasks, datasets, evaluation metrics, participation, and baseline systems for all three tasks. We also summarize the techniques explored by the participating teams and the evaluation results of their approaches. Finally, we discuss the key findings and their implications for the development of multimodal medical question answering.

Anthology ID: 2026.bionlp-1.88
Volume: BioNLP 2026
Month: July
Year: 2026
Address: San Diego, California
Editors: Dina Demner-Fushman, Sophia Ananiadou, Kirk Roberts, Junichi Tsujii
Venues: BioNLP | WS
Publisher: Association for Computational Linguistics
Pages: 1089–1100
URL: https://aclanthology.org/2026.bionlp-1.88/
Cite (ACL): Deepak Gupta, Collin Campbell, Pedram Golnari, and Dina Demner-Fushman. 2026. Overview of the MedGenVidQA 2026 Shared Task on Medical Generative Video Question Answering. In BioNLP 2026, pages 1089–1100, San Diego, California. Association for Computational Linguistics.
Cite (Informal): Overview of the MedGenVidQA 2026 Shared Task on Medical Generative Video Question Answering (Gupta et al., BioNLP 2026)
PDF: https://aclanthology.org/2026.bionlp-1.88.pdf
