Exploring LLM Annotation for Adaptation of Clinical Information Extraction Models under Data-sharing Restrictions

Seiji Shimizu; Hisada Shohei; Yutaka Uno; Shuntaro Yada; Shoko Wakamiya; Eiji Aramaki

doi:10.18653/v1/2025.findings-acl.757

Abstract

In-hospital text data contains valuable clinical information, yet deploying fine-tuned small language models (SLMs) for information extraction remains challenging due to differences in formatting and vocabulary across institutions. Since access to the original in-hospital data (source domain) is often restricted, annotated data from the target hospital (target domain) is crucial for domain adaptation. However, clinical annotation is notoriously expensive and time-consuming, as it demands both clinical and linguistic expertise. To address this issue, we leverage large language models (LLMs) to annotate target-domain data for adaptation. We conduct experiments on four clinical information extraction tasks covering eight target-domain datasets. Experimental results show that LLM-annotated data consistently enhances SLM performance and, given a larger amount of annotated data, outperforms manual annotation in three out of four tasks.

Anthology ID: 2025.findings-acl.757
Volume: Findings of the Association for Computational Linguistics: ACL 2025
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 14678–14694
URL: https://aclanthology.org/2025.findings-acl.757/
DOI: 10.18653/v1/2025.findings-acl.757
Cite (ACL): Seiji Shimizu, Hisada Shohei, Yutaka Uno, Shuntaro Yada, Shoko Wakamiya, and Eiji Aramaki. 2025. Exploring LLM Annotation for Adaptation of Clinical Information Extraction Models under Data-sharing Restrictions. In Findings of the Association for Computational Linguistics: ACL 2025, pages 14678–14694, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): Exploring LLM Annotation for Adaptation of Clinical Information Extraction Models under Data-sharing Restrictions (Shimizu et al., Findings 2025)
PDF: https://aclanthology.org/2025.findings-acl.757.pdf