%0 Conference Proceedings
%T emrKBQA: A Clinical Knowledge-Base Question Answering Dataset
%A Raghavan, Preethi
%A Liang, Jennifer J.
%A Mahajan, Diwakar
%A Chandra, Rachita
%A Szolovits, Peter
%Y Demner-Fushman, Dina
%Y Cohen, Kevin Bretonnel
%Y Ananiadou, Sophia
%Y Tsujii, Junichi
%S Proceedings of the 20th Workshop on Biomedical Language Processing
%D 2021
%8 June
%I Association for Computational Linguistics
%C Online
%F raghavan-etal-2021-emrkbqa
%X We present emrKBQA, a dataset for answering physician questions from a structured patient record. It consists of questions, logical forms and answers. The questions and logical forms are generated based on real-world physician questions and are slot-filled and answered from patients in the MIMIC-III KB through a semi-automated process. This community-shared release consists of over 940000 question, logical form and answer triplets with 389 types of questions and ~7.5 paraphrases per question type. We perform experiments to validate the quality of the dataset and set benchmarks for question to logical form learning that helps answer questions on this dataset.
%R 10.18653/v1/2021.bionlp-1.7
%U https://aclanthology.org/2021.bionlp-1.7
%U https://doi.org/10.18653/v1/2021.bionlp-1.7
%P 64-73