Haotian Sun
2024
MedAdapter: Efficient Test-Time Adaptation of Large Language Models Towards Medical Reasoning
Wenqi Shi | Ran Xu | Yuchen Zhuang | Yue Yu | Haotian Sun | Hang Wu | Carl Yang | May Dongmei Wang
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Despite their improved capabilities in generation and reasoning, adapting large language models (LLMs) to the biomedical domain remains challenging due to their immense size and privacy concerns. In this study, we propose MedAdapter, a unified post-hoc adapter for test-time adaptation of LLMs towards biomedical applications. Instead of fine-tuning the entire LLM, MedAdapter effectively adapts the original model by fine-tuning only a small BERT-sized adapter to rank candidate solutions generated by LLMs. Experiments on four biomedical tasks across eight datasets demonstrate that MedAdapter effectively adapts both white-box and black-box LLMs in biomedical reasoning, achieving average performance improvements of 18.24% and 10.96%, respectively, without requiring extensive computational resources or sharing data with third parties. MedAdapter also yields enhanced performance when combined with train-time adaptation, highlighting a flexible and complementary solution to existing adaptation methods. Faced with the challenges of balancing model performance, computational resources, and data privacy, MedAdapter provides an efficient, privacy-preserving, cost-effective, and transparent solution for adapting LLMs to the biomedical domain.
2008
ANNALIST - ANNotation ALIgnment and Scoring Tool
George Demetriou | Robert Gaizauskas | Haotian Sun | Angus Roberts
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)
In this paper we describe ANNALIST (Annotation, Alignment and Scoring Tool), a scoring system for the evaluation of the output of semantic annotation systems. ANNALIST has been designed as a system that is easily extensible and configurable for different domains, data formats, and evaluation tasks. The system architecture enables data input via the use of plugins, and users can access the system's internal alignment and scoring mechanisms without the need to convert their data to a specified format. Although developed primarily for evaluation tasks that involve the scoring of entity mentions and relations, ANNALIST's generic object representation and the availability of a range of criteria for the comparison of annotations enable the system to be tailored to a variety of scoring jobs. The paper reports on results from using ANNALIST in real-world situations in comparison to other scorers which are more established in the literature. ANNALIST has been used extensively for evaluation tasks within the VIKEF (EU FP6) and CLEF (UK MRC) projects.