BioCoref: Benchmarking Biomedical Coreference Resolution with LLMs

Nourah Salem; Elizabeth White; Michael Bada; Lawrence Hunter

Abstract

Coreference resolution in biomedical texts presents unique challenges due to complex domain-specific terminology, high ambiguity in mention forms, and long-distance dependencies between coreferring expressions. In this work, we present a comprehensive evaluation of generative large language models (LLMs) for coreference resolution in the biomedical domain. Using the CRAFT corpus as our benchmark, we assess the LLMs’ performance with four prompting experiments that vary in their use of local, contextual enrichment, and domain-specific cues such as abbreviations and entity dictionaries.

Anthology ID: 2026.bionlp-1.42
Volume: BioNLP 2026
Month: July
Year: 2026
Address: San Diego, California
Editors: Dina Demner-Fushman, Sophia Ananiadou, Kirk Roberts, Junichi Tsujii
Venues: BioNLP | WS
Publisher: Association for Computational Linguistics
Pages: 519–530
URL: https://aclanthology.org/2026.bionlp-1.42/
Cite (ACL): Nourah Salem, Elizabeth White, Michael Bada, and Lawrence Hunter. 2026. BioCoref: Benchmarking Biomedical Coreference Resolution with LLMs. In BioNLP 2026, pages 519–530, San Diego, California. Association for Computational Linguistics.
Cite (Informal): BioCoref: Benchmarking Biomedical Coreference Resolution with LLMs (Salem et al., BioNLP 2026)
PDF: https://aclanthology.org/2026.bionlp-1.42.pdf

Export citation

BibTeX

@inproceedings{salem-etal-2026-biocoref,
    title = "{B}io{C}oref: Benchmarking Biomedical Coreference Resolution with {LLM}s",
    author = "Salem, Nourah  and
      White, Elizabeth  and
      Bada, Michael  and
      Hunter, Lawrence",
    editor = "Demner-Fushman, Dina  and
      Ananiadou, Sophia  and
      Roberts, Kirk  and
      Tsujii, Junichi",
    booktitle = "{B}io{NLP} 2026",
    month = jul,
    year = "2026",
    address = "San Diego, California",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2026.bionlp-1.42/",
    pages = "519--530",
    ISBN = "979-8-89176-434-7",
    abstract = "Coreference resolution in biomedical texts presents unique challenges due to complex domain-specific terminology, high ambiguity in mention forms, and long-distance dependencies between coreferring expressions. In this work, we present a comprehensive evaluation of generative large language models (LLMs) for coreference resolution in the biomedical domain. Using the CRAFT corpus as our benchmark, we assess the LLMs' performance with four prompting experiments that vary in their use of local, contextual enrichment, and domain-specific cues such as abbreviations and entity dictionaries."
}

MODS XML

<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="salem-etal-2026-biocoref">
    <titleInfo>
        <title>BioCoref: Benchmarking Biomedical Coreference Resolution with LLMs</title>
    </titleInfo>
    <name type="personal">
        <namePart type="given">Nourah</namePart>
        <namePart type="family">Salem</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Elizabeth</namePart>
        <namePart type="family">White</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Michael</namePart>
        <namePart type="family">Bada</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Lawrence</namePart>
        <namePart type="family">Hunter</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <originInfo>
        <dateIssued>2026-07</dateIssued>
    </originInfo>
    <typeOfResource>text</typeOfResource>
    <relatedItem type="host">
        <titleInfo>
            <title>BioNLP 2026</title>
        </titleInfo>
        <name type="personal">
            <namePart type="given">Dina</namePart>
            <namePart type="family">Demner-Fushman</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Sophia</namePart>
            <namePart type="family">Ananiadou</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Kirk</namePart>
            <namePart type="family">Roberts</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Junichi</namePart>
            <namePart type="family">Tsujii</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <originInfo>
            <publisher>Association for Computational Linguistics</publisher>
            <place>
                <placeTerm type="text">San Diego, California</placeTerm>
            </place>
        </originInfo>
        <genre authority="marcgt">conference publication</genre>
        <identifier type="isbn">979-8-89176-434-7</identifier>
    </relatedItem>
    <abstract>Coreference resolution in biomedical texts presents unique challenges due to complex domain-specific terminology, high ambiguity in mention forms, and long-distance dependencies between coreferring expressions. In this work, we present a comprehensive evaluation of generative large language models (LLMs) for coreference resolution in the biomedical domain. Using the CRAFT corpus as our benchmark, we assess the LLMs’ performance with four prompting experiments that vary in their use of local, contextual enrichment, and domain-specific cues such as abbreviations and entity dictionaries.</abstract>
    <identifier type="citekey">salem-etal-2026-biocoref</identifier>
    <location>
        <url>https://aclanthology.org/2026.bionlp-1.42/</url>
    </location>
    <part>
        <date>2026-07</date>
        <extent unit="page">
            <start>519</start>
            <end>530</end>
        </extent>
    </part>
</mods>
</modsCollection>

Endnote

%0 Conference Proceedings
%T BioCoref: Benchmarking Biomedical Coreference Resolution with LLMs
%A Salem, Nourah
%A White, Elizabeth
%A Bada, Michael
%A Hunter, Lawrence
%Y Demner-Fushman, Dina
%Y Ananiadou, Sophia
%Y Roberts, Kirk
%Y Tsujii, Junichi
%S BioNLP 2026
%D 2026
%8 July
%I Association for Computational Linguistics
%C San Diego, California
%@ 979-8-89176-434-7
%F salem-etal-2026-biocoref
%X Coreference resolution in biomedical texts presents unique challenges due to complex domain-specific terminology, high ambiguity in mention forms, and long-distance dependencies between coreferring expressions. In this work, we present a comprehensive evaluation of generative large language models (LLMs) for coreference resolution in the biomedical domain. Using the CRAFT corpus as our benchmark, we assess the LLMs’ performance with four prompting experiments that vary in their use of local, contextual enrichment, and domain-specific cues such as abbreviations and entity dictionaries.
%U https://aclanthology.org/2026.bionlp-1.42/
%P 519-530

Markdown (Informal)

[BioCoref: Benchmarking Biomedical Coreference Resolution with LLMs](https://aclanthology.org/2026.bionlp-1.42/) (Salem et al., BioNLP 2026)