Intermediate Domain Finetuning for Weakly Supervised Domain-adaptive Clinical NER

Shilpa Suresh, Nazgol Tavabi, Shahriar Golchin, Leah Gilreath, Rafael Garcia-Andujar, Alexander Kim, Joseph Murray, Blake Bacevich, Ata Kiapour


Abstract
Accurate human-annotated data for real-world use cases can be scarce and expensive to obtain. In the clinical domain, obtaining such data is even more difficult due to privacy concerns, which not only restrict open access to quality data but also require that the annotation be done by domain experts. In this paper, we propose a novel framework, InterDAPT, that leverages Intermediate Domain Finetuning to allow language models to adapt to narrow domains with small, noisy datasets. By making use of peripherally related, unlabeled datasets, this framework circumvents domain-specific data scarcity issues. Our results show that this weakly supervised framework provides performance improvements in downstream clinical named entity recognition tasks.
Anthology ID:
2023.bionlp-1.29
Volume:
The 22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Dina Demner-Fushman, Sophia Ananiadou, Kevin Cohen
Venue:
BioNLP
Publisher:
Association for Computational Linguistics
Pages:
320–325
URL:
https://aclanthology.org/2023.bionlp-1.29
DOI:
10.18653/v1/2023.bionlp-1.29
Cite (ACL):
Shilpa Suresh, Nazgol Tavabi, Shahriar Golchin, Leah Gilreath, Rafael Garcia-Andujar, Alexander Kim, Joseph Murray, Blake Bacevich, and Ata Kiapour. 2023. Intermediate Domain Finetuning for Weakly Supervised Domain-adaptive Clinical NER. In The 22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks, pages 320–325, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
Intermediate Domain Finetuning for Weakly Supervised Domain-adaptive Clinical NER (Suresh et al., BioNLP 2023)
PDF:
https://aclanthology.org/2023.bionlp-1.29.pdf
Video:
https://aclanthology.org/2023.bionlp-1.29.mp4