Specializing Static and Contextual Embeddings in the Medical Domain Using Knowledge Graphs: Let’s Keep It Simple

Hicham El Boukkouri, Olivier Ferret, Thomas Lavergne, Pierre Zweigenbaum


Abstract
Domain adaptation of word embeddings has mainly been explored in the context of retraining general models on large specialized corpora. While this usually yields good results, we argue that knowledge graphs, which are used less frequently, could also be utilized to enhance existing representations with specialized knowledge. In this work, we aim to shed some light on whether such knowledge injection could be achieved using a basic set of tools: graph-level embeddings and concatenation. To that end, we adopt an incremental approach where we first demonstrate that static embeddings can indeed be improved through concatenation with in-domain node2vec representations. Then, we validate this approach on contextual models and generalize it further by proposing a variant of BERT that incorporates knowledge embeddings within its hidden states through the same process of concatenation. We show that this variant outperforms plain retraining on several specialized tasks, then discuss how this simple approach could be improved further. Both our code and pre-trained models are open-sourced for future research. In this work, we conduct experiments that target the medical domain and the English language.
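The concatenation step described in the abstract can be illustrated with a minimal sketch. This is not the authors' released code: the function names, the toy vectors, the L2 normalization, and the zero-vector fallback for words without a graph node are all assumptions made for illustration. In practice the graph vectors would come from node2vec trained on a medical knowledge graph.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length (assumption: both spaces are normalized first)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def concat_embeddings(word, text_vectors, graph_vectors, graph_dim):
    """Concatenate a word's static embedding with its knowledge-graph embedding.

    Words with no node in the graph get a zero vector for the graph part
    (an assumption of this sketch, not a detail taken from the paper).
    """
    text_vec = l2_normalize(text_vectors[word])
    if word in graph_vectors:
        graph_vec = l2_normalize(graph_vectors[word])
    else:
        graph_vec = [0.0] * graph_dim
    return text_vec + graph_vec  # list concatenation, not element-wise addition

# Hypothetical toy data: 3-dimensional text embeddings, 2-dimensional graph embeddings.
text_vectors = {"aspirin": [3.0, 4.0, 0.0], "table": [1.0, 0.0, 0.0]}
graph_vectors = {"aspirin": [0.0, 2.0]}
GRAPH_DIM = 2

aspirin = concat_embeddings("aspirin", text_vectors, graph_vectors, GRAPH_DIM)
table = concat_embeddings("table", text_vectors, graph_vectors, GRAPH_DIM)
print(len(aspirin), len(table))
```

Both outputs have the same length (text dimension plus graph dimension), so a downstream model can consume them uniformly whether or not the word appears in the graph.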
Anthology ID:
2022.louhi-1.9
Volume:
Proceedings of the 13th International Workshop on Health Text Mining and Information Analysis (LOUHI)
Month:
December
Year:
2022
Address:
Abu Dhabi, United Arab Emirates (Hybrid)
Editors:
Alberto Lavelli, Eben Holderness, Antonio Jimeno Yepes, Anne-Lyse Minard, James Pustejovsky, Fabio Rinaldi
Venue:
Louhi
Publisher:
Association for Computational Linguistics
Pages:
69–80
URL:
https://aclanthology.org/2022.louhi-1.9
DOI:
10.18653/v1/2022.louhi-1.9
Cite (ACL):
Hicham El Boukkouri, Olivier Ferret, Thomas Lavergne, and Pierre Zweigenbaum. 2022. Specializing Static and Contextual Embeddings in the Medical Domain Using Knowledge Graphs: Let’s Keep It Simple. In Proceedings of the 13th International Workshop on Health Text Mining and Information Analysis (LOUHI), pages 69–80, Abu Dhabi, United Arab Emirates (Hybrid). Association for Computational Linguistics.
Cite (Informal):
Specializing Static and Contextual Embeddings in the Medical Domain Using Knowledge Graphs: Let’s Keep It Simple (El Boukkouri et al., Louhi 2022)
PDF:
https://aclanthology.org/2022.louhi-1.9.pdf
Video:
https://aclanthology.org/2022.louhi-1.9.mp4