WildGraphBench: Benchmarking GraphRAG with Wild-Source Corpora

Pengyu Wang; Benfeng Xu; Licheng Zhang; Shaohan Wang; Mingxuan Du; Chiwei Zhu; Zhendong Mao

Abstract

Graph-based Retrieval-Augmented Generation (GraphRAG) organizes external knowledge as a hierarchical graph, enabling efficient retrieval and aggregation of scattered evidence across multiple documents. However, many existing benchmarks for GraphRAG rely on short, curated passages as external knowledge, failing to adequately evaluate systems in realistic settings involving long contexts and large-scale heterogeneous documents. To bridge this gap, we introduce WildGraphBench, a benchmark designed to assess GraphRAG performance in the wild. We leverage Wikipedia’s unique structure, where cohesive narratives are grounded in long and heterogeneous external reference documents, to construct a benchmark reflecting real-world scenarios. Specifically, we sample articles across 12 top-level topics, using their external references as the retrieval corpus and citation-linked statements as ground truth, resulting in 1,100 questions spanning three levels of complexity: single-fact QA, multi-fact QA, and section-level summarization. Experiments across multiple baselines reveal that current GraphRAG pipelines help on multi-fact aggregation when evidence comes from a moderate number of sources, but this aggregation paradigm may overemphasize high-level statements at the expense of fine-grained details, leading to weaker performance on summarization tasks.

Anthology ID: 2026.findings-acl.679
Volume: Findings of the Association for Computational Linguistics: ACL 2026
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 13875–13890
URL: https://aclanthology.org/2026.findings-acl.679/
Cite (ACL): Pengyu Wang, Benfeng Xu, Licheng Zhang, Shaohan Wang, Mingxuan Du, Chiwei Zhu, and Zhendong Mao. 2026. WildGraphBench: Benchmarking GraphRAG with Wild-Source Corpora. In Findings of the Association for Computational Linguistics: ACL 2026, pages 13875–13890, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): WildGraphBench: Benchmarking GraphRAG with Wild-Source Corpora (Wang et al., Findings 2026)
PDF: https://aclanthology.org/2026.findings-acl.679.pdf
Checklist: 2026.findings-acl.679.checklist.pdf
