Evaluating Pre-Trained Language Models for Focused Terminology Extraction from Swedish Medical Records

Oskar Jerdhaf, Marina Santini, Peter Lundberg, Tomas Bjerner, Yosef Al-Abasse, Arne Jonsson, Thomas Vakili


Abstract
In the experiments briefly presented in this abstract, we compare the performance of a generalist Swedish pre-trained language model with that of a domain-specific Swedish pre-trained model on the downstream task of focused terminology extraction of implant terms, i.e. terms that indicate the presence of implants in a patient's body. The fine-tuning is identical for both models. For the search strategy, we rely on a KD-tree, which we feed with two different lists of seed terms: one with noise and one without. Results show that the use of a domain-specific pre-trained language model has a positive impact on focused terminology extraction only when the seed terms are noise-free.
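To illustrate the search strategy described in the abstract, the following is a minimal sketch of KD-tree-based nearest-neighbour retrieval over term embeddings, not the authors' implementation: the encoder (DummyModel), the vocabulary, and the seed terms are placeholders standing in for the paper's pre-trained models and seed lists.

# Minimal sketch: retrieve implant-term candidates near seed terms with a KD-tree.
# Assumption: fixed-size embeddings are available for every candidate term.
import numpy as np
from scipy.spatial import cKDTree

class DummyModel:
    # Placeholder encoder producing deterministic pseudo-embeddings;
    # the paper uses Swedish pre-trained language models instead.
    def encode(self, term):
        rng = np.random.default_rng(abs(hash(term)) % (2**32))
        return rng.standard_normal(16)

def embed(terms, model):
    # Stack one embedding vector per term into a (n_terms, dim) matrix.
    return np.vstack([model.encode(t) for t in terms])

# Hypothetical candidate vocabulary from the records and seed list of implant terms.
vocab = ["pacemaker", "stent", "protes", "tablett", "feber"]
seeds = ["pacemaker", "stent"]

model = DummyModel()
tree = cKDTree(embed(vocab, model))                  # index all candidate terms
dists, idxs = tree.query(embed(seeds, model), k=3)   # nearest neighbours per seed

for seed, row in zip(seeds, idxs):
    print(seed, "->", [vocab[i] for i in row])

In this scheme, the quality of the seed list directly shapes the neighbourhoods retrieved, which is consistent with the abstract's finding that noise-free seeds are required for the domain-specific model to pay off.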
Anthology ID:
2022.term-1.6
Volume:
Proceedings of the Workshop on Terminology in the 21st century: many faces, many places
Month:
June
Year:
2022
Address:
Marseille, France
Editors:
Rute Costa, Sara Carvalho, Ana Ostroški Anić, Anas Fahad Khan
Venue:
TERM
Publisher:
European Language Resources Association
Pages:
30–32
URL:
https://aclanthology.org/2022.term-1.6
Cite (ACL):
Oskar Jerdhaf, Marina Santini, Peter Lundberg, Tomas Bjerner, Yosef Al-Abasse, Arne Jonsson, and Thomas Vakili. 2022. Evaluating Pre-Trained Language Models for Focused Terminology Extraction from Swedish Medical Records. In Proceedings of the Workshop on Terminology in the 21st century: many faces, many places, pages 30–32, Marseille, France. European Language Resources Association.
Cite (Informal):
Evaluating Pre-Trained Language Models for Focused Terminology Extraction from Swedish Medical Records (Jerdhaf et al., TERM 2022)
PDF:
https://aclanthology.org/2022.term-1.6.pdf