Hierarchical Pretraining on Multimodal Electronic Health Records

Xiaochen Wang, Junyu Luo, Jiaqi Wang, Ziyi Yin, Suhan Cui, Yuan Zhong, Yaqing Wang, Fenglong Ma


Abstract
Pretraining has proven to be a powerful technique in natural language processing (NLP), exhibiting remarkable success in various NLP downstream tasks. However, in the medical domain, existing pretrained models on electronic health records (EHR) fail to capture the hierarchical nature of EHR data, limiting their generalization capability across diverse downstream tasks using a single pretrained model. To tackle this challenge, this paper introduces a novel, general, and unified pretraining framework called MedHMP, specifically designed for hierarchically multimodal EHR data. The effectiveness of the proposed MedHMP is demonstrated through experimental results on eight downstream tasks spanning three levels. Comparisons against eighteen baselines further highlight the efficacy of our approach.
Anthology ID:
2023.emnlp-main.171
Volume:
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
Month:
December
Year:
2023
Address:
Singapore
Editors:
Houda Bouamor, Juan Pino, Kalika Bali
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
2839–2852
URL:
https://aclanthology.org/2023.emnlp-main.171
DOI:
10.18653/v1/2023.emnlp-main.171
Cite (ACL):
Xiaochen Wang, Junyu Luo, Jiaqi Wang, Ziyi Yin, Suhan Cui, Yuan Zhong, Yaqing Wang, and Fenglong Ma. 2023. Hierarchical Pretraining on Multimodal Electronic Health Records. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 2839–2852, Singapore. Association for Computational Linguistics.
Cite (Informal):
Hierarchical Pretraining on Multimodal Electronic Health Records (Wang et al., EMNLP 2023)
PDF:
https://aclanthology.org/2023.emnlp-main.171.pdf
Video:
https://aclanthology.org/2023.emnlp-main.171.mp4