Medical Concept Normalization in User-Generated Texts by Learning Target Concept Embeddings

Katikapalli Subramanyam Kalyan; Sivanesan Sangeetha

doi:10.18653/v1/2020.louhi-1.3

Abstract

Medical concept normalization helps in discovering standard concepts in free-form text, i.e., it maps health-related mentions to standard concepts in a clinical knowledge base. It goes well beyond simple string matching and requires a deep semantic understanding of concept mentions. Recent research approaches concept normalization as either text classification or text similarity. Each has a main drawback: (a) the text classification approach ignores valuable target concept information when learning the input concept mention representation, and (b) the text similarity approach needs to generate target concept embeddings separately, which is time and resource consuming. Our proposed model overcomes these drawbacks by jointly learning the representations of the input concept mention and the target concepts. First, we learn the input concept mention representation using RoBERTa. Second, we compute the cosine similarity between the embedding of the input concept mention and the embeddings of all the target concepts. Here, the embeddings of the target concepts are randomly initialized and then updated during training. Finally, the target concept with maximum cosine similarity is assigned to the input concept mention. Our model surpasses all the existing methods across three standard datasets, improving accuracy by up to 2.31%.
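
The assignment step described in the abstract (cosine similarity against randomly initialized target concept embeddings, then picking the maximum) can be illustrated with a minimal sketch. This is not the authors' implementation: the RoBERTa encoder is replaced by a stand-in mention vector, the training update of the concept embeddings is not shown, and the function names, concept identifiers, and embedding dimension are all hypothetical.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def init_concept_embeddings(concept_ids, dim, seed=0):
    """Randomly initialize one vector per target concept.

    In the paper these vectors are updated during training; here they
    stay fixed because the sketch only covers the assignment step.
    """
    rng = random.Random(seed)
    return {cid: [rng.gauss(0.0, 1.0) for _ in range(dim)] for cid in concept_ids}


def normalize_mention(mention_vec, concept_embeddings):
    """Return the concept id whose embedding is most cosine-similar to the mention."""
    return max(
        concept_embeddings,
        key=lambda cid: cosine(mention_vec, concept_embeddings[cid]),
    )


# Hypothetical concept ids; the mention vector stands in for an encoder output.
concepts = init_concept_embeddings(["C_A", "C_B", "C_C"], dim=8)
mention_vec = [2.0 * x for x in concepts["C_B"]]  # points in the same direction as C_B
predicted = normalize_mention(mention_vec, concepts)
```

Because cosine similarity ignores vector length, the scaled copy of the `C_B` embedding is still assigned to `C_B`. In a trainable version, both the encoder and the concept vectors would receive gradients from a loss defined over these similarities.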

Anthology ID: 2020.louhi-1.3
Volume: Proceedings of the 11th International Workshop on Health Text Mining and Information Analysis
Month: November
Year: 2020
Address: Online
Editors: Eben Holderness, Antonio Jimeno Yepes, Alberto Lavelli, Anne-Lyse Minard, James Pustejovsky, Fabio Rinaldi
Venue: Louhi
Publisher: Association for Computational Linguistics
Pages: 18–23
URL: https://aclanthology.org/2020.louhi-1.3
DOI: 10.18653/v1/2020.louhi-1.3
Cite (ACL): Katikapalli Subramanyam Kalyan and Sivanesan Sangeetha. 2020. Medical Concept Normalization in User-Generated Texts by Learning Target Concept Embeddings. In Proceedings of the 11th International Workshop on Health Text Mining and Information Analysis, pages 18–23, Online. Association for Computational Linguistics.
Cite (Informal): Medical Concept Normalization in User-Generated Texts by Learning Target Concept Embeddings (Kalyan & Sangeetha, Louhi 2020)
PDF: https://aclanthology.org/2020.louhi-1.3.pdf
Video: https://slideslive.com/38940046