Yosef Al-Abasse


2022

Classifying Implant-Bearing Patients via their Medical Histories: a Pre-Study on Swedish EMRs with Semi-Supervised GanBERT
Benjamin Danielsson | Marina Santini | Peter Lundberg | Yosef Al-Abasse | Arne Jonsson | Emma Eneling | Magnus Stridsman
Proceedings of the Thirteenth Language Resources and Evaluation Conference

In this paper, we compare the performance of two BERT-based text classifiers whose task is to classify patients (more precisely, their medical histories) as having or not having implant(s) in their body. One classifier is a fully-supervised BERT classifier. The other is a semi-supervised GAN-BERT classifier. Both models are compared against a fully-supervised SVM classifier. Since fully-supervised classification is expensive in terms of data annotation, the experiments presented in this paper investigate whether a semi-supervised classifier trained on only a small amount of annotated data can achieve competitive performance. Results are promising and show that the semi-supervised classifier performs competitively with the fully-supervised classifier.

Evaluating Pre-Trained Language Models for Focused Terminology Extraction from Swedish Medical Records
Oskar Jerdhaf | Marina Santini | Peter Lundberg | Tomas Bjerner | Yosef Al-Abasse | Arne Jonsson | Thomas Vakili
Proceedings of the Workshop on Terminology in the 21st century: many faces, many places

In the experiments briefly presented in this abstract, we compare the performance of a generalist Swedish pre-trained language model with a domain-specific Swedish pre-trained model on the downstream task of focused terminology extraction of implant terms, which are terms that indicate the presence of implants in the body of patients. The fine-tuning is identical for both models. For the search strategy, we rely on a KD-Tree that we feed with two different lists of term seeds, one with noise and one without noise. Results show that the use of a domain-specific pre-trained language model has a positive impact on focused terminology extraction only when using term seeds without noise.
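
The abstract does not describe the search procedure in detail, so the following is only a minimal sketch of how a KD-Tree might be used to expand a list of term seeds into candidate terms via nearest-neighbour lookup. Everything here is an assumption for illustration: the two-dimensional vectors are made up (real language-model embeddings have hundreds of dimensions), the toy vocabulary and the names `build_kdtree`, `nearest`, and `expand_seeds` are hypothetical, and Euclidean distance is chosen for simplicity rather than taken from the paper.

```python
import heapq
import math


def build_kdtree(points, depth=0):
    """Recursively build a KD-tree from (vector, label) pairs."""
    if not points:
        return None
    axis = depth % len(points[0][0])
    points = sorted(points, key=lambda p: p[0][axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }


def nearest(node, query, k, heap=None):
    """Return the k nearest (distance, label) pairs to query, closest first."""
    top = heap is None
    if top:
        heap = []
    if node is not None:
        vec, label = node["point"]
        dist = math.dist(vec, query)
        # Negated distances turn heapq's min-heap into a max-heap,
        # so heap[0] is always the worst of the k best hits so far.
        if len(heap) < k:
            heapq.heappush(heap, (-dist, label))
        elif dist < -heap[0][0]:
            heapq.heapreplace(heap, (-dist, label))
        diff = query[node["axis"]] - vec[node["axis"]]
        near, far = (
            (node["left"], node["right"]) if diff < 0
            else (node["right"], node["left"])
        )
        nearest(near, query, k, heap)
        # Visit the far side only if the splitting plane is closer
        # than the current worst hit (or we still need more hits).
        if len(heap) < k or abs(diff) < -heap[0][0]:
            nearest(far, query, k, heap)
    if top:
        return sorted((-d, lbl) for d, lbl in heap)
    return heap


def expand_seeds(tree, embeddings, seeds, k):
    """Collect up to k neighbours per seed term, skipping known seeds."""
    candidates = []
    for seed in seeds:
        # Ask for one extra hit because the seed is its own nearest neighbour.
        for _, term in nearest(tree, embeddings[seed], k + 1):
            if term not in seeds and term not in candidates:
                candidates.append(term)
    return candidates


# Made-up 2-D vectors standing in for language-model embeddings.
embeddings = {
    "pacemaker": (0.9, 0.1),
    "stent": (0.8, 0.2),
    "protes": (0.85, 0.15),
    "hosta": (0.1, 0.9),
    "feber": (0.2, 0.8),
}
tree = build_kdtree([(vec, term) for term, vec in embeddings.items()])
candidates = expand_seeds(tree, embeddings, ["pacemaker"], k=2)
print(candidates)
```

In this toy setting, a noisy seed (for example, adding "feber" to the seed list) would pull unrelated neighbours such as "hosta" into the candidate set, which gives some intuition for why seed quality could matter for the extraction results.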