Cassie S. Mitchell
2026
Where do LLMs currently stand on biomedical NER in both clean and noisy settings?
Christophe Ye | Cassie S. Mitchell
Findings of the Association for Computational Linguistics: EACL 2026
Biomedical Named Entity Recognition (NER) consists of identifying and classifying important biomedical entities mentioned in text. Traditionally, biomedical NER has relied heavily on domain-specific pre-trained language models, particularly BERT variants. With the emergence of large language models (LLMs), some studies have evaluated their performance on biomedical NLP tasks. These studies consistently show that, despite their general capabilities, LLMs still fall short of specialized BERT-based models for biomedical NER. However, as LLMs continue to advance at a remarkable pace, natural questions arise: Are they still far behind, or are they starting to be competitive? In this study, we investigate the performance of recent LLMs across multiple biomedical NER datasets under both clean and noisy dataset conditions. Our findings reveal that LLMs are progressively closing the performance gap with BERT-based models and demonstrate particular strengths in low-data settings. Moreover, our results suggest that in-context learning with LLMs exhibits a notable degree of robustness to noise, making them a promising alternative in settings where labeled data is scarce or noisy.
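The in-context learning setup mentioned in the abstract can be sketched as a few-shot prompt plus a lightweight output parser. This is a hypothetical illustration only: the example sentences, entity labels, and the `span -> type` output format are assumptions, not the paper's actual prompt design.

```python
# Hedged sketch of few-shot in-context learning for biomedical NER.
# The demonstrations, label set, and output format below are illustrative
# assumptions; they are not taken from the paper.

FEW_SHOT_EXAMPLES = [
    ("EGFR mutations predict response to gefitinib.",
     [("EGFR", "Gene"), ("gefitinib", "Chemical")]),
    ("Patients with type 2 diabetes received metformin.",
     [("type 2 diabetes", "Disease"), ("metformin", "Chemical")]),
]

def build_ner_prompt(sentence: str) -> str:
    """Assemble a few-shot prompt asking an LLM to tag biomedical entities."""
    lines = ["Extract biomedical entities as 'span -> type' pairs, one per line."]
    for text, entities in FEW_SHOT_EXAMPLES:
        lines.append(f"Sentence: {text}")
        lines.extend(f"{span} -> {label}" for span, label in entities)
    lines.append(f"Sentence: {sentence}")
    return "\n".join(lines)

def parse_ner_output(raw: str) -> list[tuple[str, str]]:
    """Parse the LLM's 'span -> type' lines back into (span, label) tuples."""
    pairs = []
    for line in raw.splitlines():
        if "->" in line:
            span, _, label = line.partition("->")
            pairs.append((span.strip(), label.strip()))
    return pairs
```

The prompt would be sent to whichever LLM is under evaluation; only the prompt construction and parsing are shown here.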
2025
BioEL: A Comprehensive Python Package for Biomedical Entity Linking
Prasanth Bathala | Christophe Ye | Batuhan Nursal | Shubham Lohiya | David Kartchner | Cassie S. Mitchell
Findings of the Association for Computational Linguistics: NAACL 2025
LLM as Entity Disambiguator for Biomedical Entity-Linking
Christophe Ye | Cassie S. Mitchell
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
Entity linking involves normalizing a mention in medical text to a unique identifier in a knowledge base, such as UMLS or MeSH. Most entity linkers follow a two-stage process: first, a candidate generation step selects high-quality candidates, and then a named entity disambiguation phase determines the best candidate for final linking. This study demonstrates that leveraging a large language model (LLM) as an entity disambiguator significantly enhances entity linking models’ accuracy and recall. Specifically, the LLM disambiguator achieves remarkable improvements when applied to alias-matching entity linking methods. Without any fine-tuning, our approach establishes a new state-of-the-art (SOTA), surpassing previous methods on multiple prevalent biomedical datasets by up to 16 points in accuracy. We released our code on GitHub at https://github.com/ChristopheYe/llm_disambiguator