FinMRAGBench: A Realistic and Complex Benchmark for Multi-Modal RAG in Financial Document Analysis

Shouqing Yang; Qi Zhang; Yuhang Yang; Ruikang Xu; Yuwei Hou; Zhulin Jia; Lirong Gao; Haobo Wang; Jinglei Chen; Jiexiang Wang; Sheng Guo; Bo Zheng; Gang Chen

Abstract

Retrieval-augmented generation (RAG) has become a widely adopted paradigm for analysis over financial documents. However, existing benchmarks fail to capture realistic financial analysis settings that involve cross-document retrieval, multi-page evidence integration, and diverse analytical tasks. To address this gap, we introduce FinMRAGBench, a comprehensive multi-modal financial RAG benchmark constructed from large-scale real-world annual reports. It comprises 887 expert-verified QA pairs spanning five representative financial analysis tasks, and most of its questions require retrieving evidence scattered across multiple pages and documents. Moreover, we introduce FinMRAGAgent, an agent trained on high-quality agentic trajectories following the reasoning-and-acting (ReAct) paradigm, capable of dynamic tool invocation and multi-step financial analysis. Our extensive experiments show that current multi-modal RAG systems still struggle with incomplete retrieval and complex financial reasoning. In contrast, FinMRAGAgent achieves the strongest overall performance among all evaluated models, demonstrating that our structured reasoning approach significantly enhances multi-modal RAG in realistic financial scenarios. The code and data are available at https://github.com/sqyangit/FinMRAGBench.

Anthology ID: 2026.findings-acl.187
Volume: Findings of the Association for Computational Linguistics: ACL 2026
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 3822–3867
URL: https://aclanthology.org/2026.findings-acl.187/
Cite (ACL): Shouqing Yang, Qi Zhang, Yuhang Yang, Ruikang Xu, Yuwei Hou, Zhulin Jia, Lirong Gao, Haobo Wang, Jinglei Chen, Jiexiang Wang, Sheng Guo, Bo Zheng, and Gang Chen. 2026. FinMRAGBench: A Realistic and Complex Benchmark for Multi-Modal RAG in Financial Document Analysis. In Findings of the Association for Computational Linguistics: ACL 2026, pages 3822–3867, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): FinMRAGBench: A Realistic and Complex Benchmark for Multi-Modal RAG in Financial Document Analysis (Yang et al., Findings 2026)
PDF: https://aclanthology.org/2026.findings-acl.187.pdf
Checklist: 2026.findings-acl.187.checklist.pdf
