@inproceedings{brokman-etal-2025-benchmark,
  title     = {A benchmark for end-to-end zero-shot biomedical relation extraction with {LLMs}: experiments with {OpenAI} models},
  author    = {Brokman, Aviv and
               Ai, Xuguang and
               Jiang, Yuhang and
               Gupta, Shashank and
               Kavuluru, Ramakanth},
  editor    = {Accomazzi, Alberto and
               Ghosal, Tirthankar and
               Grezes, Felix and
               Lockhart, Kelly},
  booktitle = {Proceedings of the Third Workshop for Artificial Intelligence for Scientific Publications},
  month     = dec,
  year      = {2025},
  address   = {Mumbai, India and virtual},
  publisher = {Association for Computational Linguistics},
  url       = {https://aclanthology.org/2025.wasp-main.6/},
  pages     = {44--55},
  isbn      = {979-8-89176-310-4},
  abstract  = {Extracting relations from scientific literature is a fundamental task in biomedical NLP because entities and relations among them drive hypothesis generation and knowledge discovery. As literature grows rapidly, relation extraction (RE) is indispensable to curate knowledge graphs to be used as computable structured and symbolic representations. With the rise of LLMs, it is pertinent to examine if it is better to skip tailoring supervised RE methods, save annotation burden, and just use zero shot RE (ZSRE) via LLM API calls. In this paper, we propose a benchmark with seven biomedical RE datasets with interesting characteristics and evaluate three Open AI models (GPT-4, o1, and GPT-OSS-120B) for end-to-end ZSRE. We show that LLM-based ZSRE is inching closer to supervised methods in performances on some datasets but still struggles on complex inputs expressing multiple relations with different predicates. Our error analysis reveals scope for improvements.},
}
<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<!-- MODS v3 bibliographic record mirroring the BibTeX entry above:
     same authors, editors, host proceedings, publisher, URL, pages,
     and ISBN for citekey brokman-etal-2025-benchmark. -->
<mods ID="brokman-etal-2025-benchmark">
<titleInfo>
<title>A benchmark for end-to-end zero-shot biomedical relation extraction with LLMs: experiments with OpenAI models</title>
</titleInfo>
<!-- Paper authors (5), role "author" per MARC relator vocabulary. -->
<name type="personal">
<namePart type="given">Aviv</namePart>
<namePart type="family">Brokman</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Xuguang</namePart>
<namePart type="family">Ai</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Yuhang</namePart>
<namePart type="family">Jiang</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Shashank</namePart>
<namePart type="family">Gupta</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Ramakanth</namePart>
<namePart type="family">Kavuluru</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<originInfo>
<dateIssued>2025-12</dateIssued>
</originInfo>
<typeOfResource>text</typeOfResource>
<!-- Host item: the workshop proceedings volume, with its editors,
     publisher, venue string, and ISBN. -->
<relatedItem type="host">
<titleInfo>
<title>Proceedings of the Third Workshop for Artificial Intelligence for Scientific Publications</title>
</titleInfo>
<name type="personal">
<namePart type="given">Alberto</namePart>
<namePart type="family">Accomazzi</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Tirthankar</namePart>
<namePart type="family">Ghosal</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Felix</namePart>
<namePart type="family">Grezes</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Kelly</namePart>
<namePart type="family">Lockhart</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<originInfo>
<publisher>Association for Computational Linguistics</publisher>
<place>
<placeTerm type="text">Mumbai, India and virtual</placeTerm>
</place>
</originInfo>
<genre authority="marcgt">conference publication</genre>
<identifier type="isbn">979-8-89176-310-4</identifier>
</relatedItem>
<abstract>Extracting relations from scientific literature is a fundamental task in biomedical NLP because entities and relations among them drive hypothesis generation and knowledge discovery. As literature grows rapidly, relation extraction (RE) is indispensable to curate knowledge graphs to be used as computable structured and symbolic representations. With the rise of LLMs, it is pertinent to examine if it is better to skip tailoring supervised RE methods, save annotation burden, and just use zero shot RE (ZSRE) via LLM API calls. In this paper, we propose a benchmark with seven biomedical RE datasets with interesting characteristics and evaluate three Open AI models (GPT-4, o1, and GPT-OSS-120B) for end-to-end ZSRE. We show that LLM-based ZSRE is inching closer to supervised methods in performances on some datasets but still struggles on complex inputs expressing multiple relations with different predicates. Our error analysis reveals scope for improvements.</abstract>
<identifier type="citekey">brokman-etal-2025-benchmark</identifier>
<location>
<url>https://aclanthology.org/2025.wasp-main.6/</url>
</location>
<!-- Pagination within the host proceedings. -->
<part>
<date>2025-12</date>
<extent unit="page">
<start>44</start>
<end>55</end>
</extent>
</part>
</mods>
</modsCollection>
%0 Conference Proceedings
%T A benchmark for end-to-end zero-shot biomedical relation extraction with LLMs: experiments with OpenAI models
%A Brokman, Aviv
%A Ai, Xuguang
%A Jiang, Yuhang
%A Gupta, Shashank
%A Kavuluru, Ramakanth
%Y Accomazzi, Alberto
%Y Ghosal, Tirthankar
%Y Grezes, Felix
%Y Lockhart, Kelly
%S Proceedings of the Third Workshop for Artificial Intelligence for Scientific Publications
%D 2025
%8 December
%I Association for Computational Linguistics
%C Mumbai, India and virtual
%@ 979-8-89176-310-4
%F brokman-etal-2025-benchmark
%X Extracting relations from scientific literature is a fundamental task in biomedical NLP because entities and relations among them drive hypothesis generation and knowledge discovery. As literature grows rapidly, relation extraction (RE) is indispensable to curate knowledge graphs to be used as computable structured and symbolic representations. With the rise of LLMs, it is pertinent to examine if it is better to skip tailoring supervised RE methods, save annotation burden, and just use zero shot RE (ZSRE) via LLM API calls. In this paper, we propose a benchmark with seven biomedical RE datasets with interesting characteristics and evaluate three Open AI models (GPT-4, o1, and GPT-OSS-120B) for end-to-end ZSRE. We show that LLM-based ZSRE is inching closer to supervised methods in performances on some datasets but still struggles on complex inputs expressing multiple relations with different predicates. Our error analysis reveals scope for improvements.
%U https://aclanthology.org/2025.wasp-main.6/
%P 44-55
Markdown (Informal)
[A benchmark for end-to-end zero-shot biomedical relation extraction with LLMs: experiments with OpenAI models](https://aclanthology.org/2025.wasp-main.6/) (Brokman et al., WASP 2025)
ACL