SubmissionNumber#=%=#38 FinalPaperTitle#=%=#How to Efficiently Explore Noisy Historical Data? Leveraging Corpus Pre-Targeting to Enhance Graph-based RAG ShortPaperTitle#=%=# NumberOfPages#=%=#10 CopyrightSigned#=%=#Donghan Bian JobTitle#==# Organization#==# Abstract#==#Graph-based Retrieval-Augmented Generation (RAG) is increasingly used to explore long, heterogeneous, and weakly structured corpora, including historical archives. However, in such settings, naive full-corpus indexing is often computationally costly and sensitive to OCR noise, document redundancy, and topical dispersion. In this paper, we investigate corpus pre-targeting strategies as an intermediate layer to improve the efficiency and effectiveness of graph-based RAG for historical research. We evaluate a set of pre-targeting heuristics tailored to single-hop and multi-hop historical questions on HistoriQA-ThirdRepublic, a French question-answering dataset derived from parliamentary debates and contemporary newspapers. Our results show that appropriate pre-targeting strategies can improve retrieval recall by 3–5% while reducing token consumption by 32–37% compared to full-corpus indexing, without degrading coverage of relevant documents. Beyond performance gains, this work highlights the importance of corpus-level optimization for applying RAG to large-scale historical collections, and provides practical insights for adapting graph-based RAG pipelines to the specific constraints of digitized archives. 
Author{1}{Firstname}#=%=#Donghan Author{1}{Lastname}#=%=#Bian Author{1}{Username}#=%=#kepler9 Author{1}{Orcid}#=%=# Author{1}{Email}#=%=#donghan.bian@chartes.psl.eu Author{1}{Affiliation}#=%=#Ecole Nationale des Chartes Author{2}{Firstname}#=%=#Marie Author{2}{Lastname}#=%=#Puren Author{2}{Orcid}#=%=# Author{2}{Email}#=%=#marie.puren@epita.fr Author{2}{Affiliation}#=%=#EPITA Author{3}{Firstname}#=%=#Florian Author{3}{Lastname}#=%=#Cafiero Author{3}{Orcid}#=%=# Author{3}{Email}#=%=#florian.cafiero@chartes.psl.eu Author{3}{Affiliation}#=%=#Ecole Nationale des Chartes ========== èéáğö