%0 Conference Proceedings
%T emrKBQA: A Clinical Knowledge-Base Question Answering Dataset
%A Raghavan, Preethi
%A Liang, Jennifer J.
%A Mahajan, Diwakar
%A Chandra, Rachita
%A Szolovits, Peter
%Y Demner-Fushman, Dina
%Y Cohen, Kevin Bretonnel
%Y Ananiadou, Sophia
%Y Tsujii, Junichi
%S Proceedings of the 20th Workshop on Biomedical Language Processing
%D 2021
%8 June
%I Association for Computational Linguistics
%C Online
%F raghavan-etal-2021-emrkbqa
%X We present emrKBQA, a dataset for answering physician questions from a structured patient record. It consists of questions, logical forms and answers. The questions and logical forms are generated based on real-world physician questions and are slot-filled and answered from patients in the MIMIC-III KB through a semi-automated process. This community-shared release consists of over 940,000 question, logical form and answer triplets with 389 types of questions and ~7.5 paraphrases per question type. We perform experiments to validate the quality of the dataset and set benchmarks for question-to-logical-form learning, which helps answer questions on this dataset.
%R 10.18653/v1/2021.bionlp-1.7
%U https://aclanthology.org/2021.bionlp-1.7
%U https://doi.org/10.18653/v1/2021.bionlp-1.7
%P 64-73