Barbara Koroušić Seljak


SAFFRON: tranSfer leArning For Food-disease RelatiOn extractioN
Gjorgjina Cenikj | Tome Eftimov | Barbara Koroušić Seljak
Proceedings of the 20th Workshop on Biomedical Language Processing

The accelerating growth of big data in the biomedical domain, with ever-growing volumes of electronic health records and more than 30 million citations and abstracts in PubMed, creates a need for automatic structuring of textual biomedical data. In this paper, we develop a method for detecting relations between food and disease entities in raw text. Due to the lack of annotated data on food with respect to health, we explore the feasibility of transfer learning: we train BERT-based models on existing datasets annotated for the presence of cause and treat relations among different types of biomedical entities, and use them to recognize the same relations between food and disease entities in a dataset created for this study. The best models achieve macro-averaged F1 scores of 0.847 and 0.900 for the cause and treat relations, respectively.
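
For readers unfamiliar with the reported metric, the following is a minimal sketch of how a macro-averaged F1 score can be computed for a binary relation classifier (for example, "cause relation present" versus "absent"). The function names and the toy labels are illustrative assumptions and are not taken from the paper; the authors' actual evaluation code may differ.

```python
from typing import Hashable, Sequence


def f1_for_class(y_true: Sequence[Hashable], y_pred: Sequence[Hashable], cls: Hashable) -> float:
    """F1 score for a single class, treating `cls` as the positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == cls and p == cls)
    fp = sum(1 for t, p in pairs if t != cls and p == cls)
    fn = sum(1 for t, p in pairs if t == cls and p != cls)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined), so F1 is 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Unweighted mean of the per-class F1 scores over all observed labels."""
    classes = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)


# Hypothetical toy labels: 1 = relation present, 0 = relation absent.
gold = [1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 0, 0, 0, 0, 1]
print(round(macro_f1(gold, predicted), 4))
```

Because every class contributes equally to the average regardless of how many examples it has, the macro-averaged score does not let a frequent class mask poor performance on a rare one.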