Natalia Semenova


2024

Biomedical Entity Representation with Graph-Augmented Multi-Objective Transformer
Andrey Sakhovskiy | Natalia Semenova | Artur Kadurin | Elena Tutubalina
Findings of the Association for Computational Linguistics: NAACL 2024

Modern biomedical concept representations are mostly trained on synonymous concept names from a biomedical knowledge base, ignoring the inter-concept interactions and a concept’s local neighborhood in a knowledge base graph. In this paper, we introduce Biomedical Entity Representation with a Graph-Augmented Multi-Objective Transformer (BERGAMOT), which combines pre-trained language models (LMs) and graph neural networks to capture both inter-concept and intra-concept interactions from the multilingual UMLS graph. To obtain fine-grained graph representations, we introduce two additional graph-based objectives: (i) a node-level contrastive objective and (ii) the Deep Graph Infomax (DGI) loss, which maximizes the mutual information between a local subgraph and a high-level graph summary. We apply a contrastive loss to textual and graph representations to make them less sensitive to surface forms and to enable intermodal knowledge exchange. BERGAMOT achieves state-of-the-art results in zero-shot entity linking without task-specific supervision on 4 of 5 languages of the Mantra corpus and on 8 of 10 languages of the XL-BEL benchmark.

2022

Transformer-based classification of premise in tweets related to COVID-19
Vadim Porvatov | Natalia Semenova
Proceedings of The Seventh Workshop on Social Media Mining for Health Applications, Workshop & Shared Task

Automation of social network data assessment is one of the classic challenges of natural language processing. During the COVID-19 pandemic, mining people’s stances from their public messages became crucial for understanding attitudes towards health orders. In this paper, the authors propose a transformer-based predictive model that allows effective classification of the presence of stance and premise in Twitter texts.