MODE-RAG: Manifold Outlier Diagnosis and Energy-based Retrieval-Augmented Generation Evaluation

Zehang Wei; JiaXin Dai; Jiamin Yan; Xiang Xiang

Abstract

While Multimodal Retrieval-Augmented Generation (M-RAG) enhances Large Vision-Language Models, it remains highly susceptible to cross-modal hallucinations, causal fabrications, and sycophancy. Furthermore, existing mitigation pipelines often face an intervention paradox: static rules tend to disrupt accurate generations unnecessarily, whereas leaving multimodal reasoning completely unguided allows existing mismatches to cascade into severe logical fabrications. To quantify and mitigate these hallucinations, we propose MODE-RAG, a multi-agent system that uses Variational Free Energy (VFE) and internal attention states to dynamically gate interventions. High-risk queries are routed to five stage-specific agents, which integrate Monte Carlo Tree Search (MCTS) for rigorous causal derivation and logit perturbations to penalize sycophancy. Dedicated Correction and Overseer agents ensure formatting stability and perform post-hoc factual verification. To objectively evaluate our approach, we introduce ModeVent, a challenging subset derived from the MultiVent dataset. Extensive experiments indicate that our system effectively reduces hallucination rates and logical fabrication, significantly improving the robustness of M-RAG systems.

Anthology ID: 2026.magmar-main.6
Volume: Proceedings of the 2nd Workshop on Multimodal Augmented Generation via Multimodal Retrieval (MAGMaR 2026)
Month: July
Year: 2026
Address: San Diego, USA
Editors: Kenton Murray, Reno Kriz
Venues: MAGMaR | WS
Publisher: Association for Computational Linguistics
Pages: 11–26
URL: https://aclanthology.org/2026.magmar-main.6/
Cite (ACL): Zehang Wei, JiaXin Dai, Jiamin Yan, and Xiang Xiang. 2026. MODE-RAG: Manifold Outlier Diagnosis and Energy-based Retrieval-Augmented Generation Evaluation. In Proceedings of the 2nd Workshop on Multimodal Augmented Generation via Multimodal Retrieval (MAGMaR 2026), pages 11–26, San Diego, USA. Association for Computational Linguistics.
Cite (Informal): MODE-RAG: Manifold Outlier Diagnosis and Energy-based Retrieval-Augmented Generation Evaluation (Wei et al., MAGMaR 2026)
PDF: https://aclanthology.org/2026.magmar-main.6.pdf
