Social Media Medical Concept Normalization using RoBERTa in Ontology Enriched Text Similarity Framework

Katikapalli Subramanyam Kalyan, Sivanesan Sangeetha


Abstract
Pattisapu et al. (2020) formulate medical concept normalization (MCN) as a text similarity problem and propose a model based on RoBERTa and graph embedding based target concept vectors. However, graph embedding techniques ignore valuable information available in the clinical ontology, such as concept descriptions and synonyms. In this work, we enhance the model of Pattisapu et al. (2020) with two novel changes. First, we use retrofitted target concept vectors instead of graph embedding based vectors. This is the first work to leverage both concept descriptions and synonyms to represent concepts as retrofitted target concept vectors in a text similarity framework for social media MCN. Second, we generate concept and concept mention vectors of the same size, which eliminates the need for dense layers to project concept mention vectors into the target concept embedding space. Our model outperforms existing methods with improvements of up to 3.75% on two standard datasets. Further, when trained only on mapping lexicon synonyms, our model outperforms existing methods with significant improvements of up to 14.61%. We attribute these significant improvements to the two novel changes introduced.
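To illustrate the text similarity framing described in the abstract, the sketch below scores a social media concept mention against ontology concept vectors by cosine similarity. It is not the authors' released code: it assumes "roberta-base" as the encoder, mean pooling over token embeddings, and random placeholder vectors standing in for the retrofitted target concept vectors built from concept descriptions and synonyms; the paper's actual pooling and training details may differ.

```python
# Minimal sketch (not the authors' implementation): cosine-similarity matching
# of a RoBERTa mention vector against same-sized target concept vectors, so no
# dense projection layer is required.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
encoder = AutoModel.from_pretrained("roberta-base")

def encode_mention(mention: str) -> torch.Tensor:
    """Encode a concept mention into a single L2-normalized vector via mean pooling."""
    inputs = tokenizer(mention, return_tensors="pt", truncation=True)
    with torch.no_grad():
        hidden = encoder(**inputs).last_hidden_state          # (1, seq_len, 768)
    mask = inputs["attention_mask"].unsqueeze(-1)              # (1, seq_len, 1)
    pooled = (hidden * mask).sum(dim=1) / mask.sum(dim=1)      # (1, 768)
    return torch.nn.functional.normalize(pooled, dim=-1)

# Placeholder for retrofitted target concept vectors (illustrative CUIs only);
# in the paper these are derived from concept descriptions and synonyms and
# share the encoder's 768-d output size.
concept_ids = ["C0018681", "C0027497", "C0015672"]
concept_vectors = torch.nn.functional.normalize(
    torch.randn(len(concept_ids), 768), dim=-1
)

mention_vec = encode_mention("my head is killing me")
scores = mention_vec @ concept_vectors.T                       # cosine similarities
predicted = concept_ids[scores.argmax(dim=-1).item()]
print(predicted, scores)
```

Because the mention and concept vectors share the same dimensionality, prediction reduces to a nearest-neighbour lookup under cosine similarity, which is the design choice highlighted in the abstract.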
Anthology ID:
2020.knlp-1.3
Volume:
Proceedings of Knowledgeable NLP: the First Workshop on Integrating Structured Knowledge and Neural Networks for NLP
Month:
December
Year:
2020
Address:
Suzhou, China
Editors:
Oren Sar Shalom, Alexander Panchenko, Cicero dos Santos, Varvara Logacheva, Alessandro Moschitti, Ido Dagan
Venue:
knlp
Publisher:
Association for Computational Linguistics
Pages:
21–26
URL:
https://aclanthology.org/2020.knlp-1.3
Cite (ACL):
Katikapalli Subramanyam Kalyan and Sivanesan Sangeetha. 2020. Social Media Medical Concept Normalization using RoBERTa in Ontology Enriched Text Similarity Framework. In Proceedings of Knowledgeable NLP: the First Workshop on Integrating Structured Knowledge and Neural Networks for NLP, pages 21–26, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
Social Media Medical Concept Normalization using RoBERTa in Ontology Enriched Text Similarity Framework (Kalyan & Sangeetha, knlp 2020)
PDF:
https://aclanthology.org/2020.knlp-1.3.pdf