DICE: Data-Efficient Clinical Event Extraction with Generative Models

Mingyu Derek Ma, Alexander Taylor, Wei Wang, Nanyun Peng


Abstract
Event extraction for the clinical domain is an under-explored research area. The lack of training data, along with the high volume of domain-specific terminology with vague entity boundaries, makes the task especially challenging. In this paper, we introduce DICE, a robust and data-efficient generative model for clinical event extraction. DICE frames event extraction as a conditional generation problem and introduces a contrastive learning objective to accurately decide the boundaries of biomedical mentions. DICE also trains an auxiliary mention identification task jointly with event extraction tasks to better identify entity mention boundaries, and further introduces special markers to incorporate identified entity mentions as trigger and argument candidates for their respective tasks. To benchmark clinical event extraction, we compose MACCROBAT-EE, the first clinical event extraction dataset with argument annotation, based on the existing clinical information extraction dataset MACCROBAT. Our experiments demonstrate state-of-the-art performance of DICE for clinical and news domain event extraction, especially under low-data settings.
Anthology ID:
2023.acl-long.886
Volume:
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
15898–15917
URL:
https://aclanthology.org/2023.acl-long.886
DOI:
10.18653/v1/2023.acl-long.886
Cite (ACL):
Mingyu Derek Ma, Alexander Taylor, Wei Wang, and Nanyun Peng. 2023. DICE: Data-Efficient Clinical Event Extraction with Generative Models. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 15898–15917, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
DICE: Data-Efficient Clinical Event Extraction with Generative Models (Ma et al., ACL 2023)
PDF:
https://aclanthology.org/2023.acl-long.886.pdf
Video:
https://aclanthology.org/2023.acl-long.886.mp4