Selective Multimodal Retrieval for Automated Verification of Image–Text Claims

Yoana Tsoneva, Paul-Conrad Feig, Jiaao Li, Veronika Solopova, Neda Foroutan, Arthur Hilbert, Vera Schmitt


Abstract
This paper presents an efficiency-aware pipeline for automated fact-checking of real-world image–text claims that treats multimodality as a controllable design variable rather than a property that must be uniformly propagated through every stage of the system. The approach decomposes claims into verification questions, assigns each to text- or image-related types, and applies modality-aware retrieval strategies, while ultimately relying on text-only evidence for verdict prediction and justification generation. Evaluated on the AVerImaTeC dataset within the FEVER-9 shared task, the system achieves competitive question, evidence, verdict, and justification scores and ranks fourth overall, outperforming the official baseline on evidence recall, verdict accuracy, and justification quality despite not using visual evidence during retrieval. These results demonstrate that strong performance on multimodal fact-checking can be achieved by selectively controlling where visual information influences retrieval and reasoning, rather than performing full multimodal fusion at every stage of the pipeline.
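The routing idea in the abstract (type each verification question, then pick a retrieval strategy by type, keeping all evidence textual) can be illustrated with a minimal sketch. Everything here is hypothetical and not taken from the paper: the `Question` class, the keyword-based `assign_type` stand-in, and the two placeholder retrievers are assumptions chosen only to show the control flow.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Question:
    text: str
    qtype: str  # "text" or "image" (hypothetical labels)


def assign_type(question_text: str) -> str:
    """Toy stand-in for question typing: a keyword cue, NOT the authors' method."""
    cues = ("image", "photo", "picture", "shown", "depict")
    lowered = question_text.lower()
    return "image" if any(cue in lowered for cue in cues) else "text"


def retrieve_text(q: Question) -> List[str]:
    # Placeholder: a real system would query a text search backend here.
    return [f"[text-search] {q.text}"]


def retrieve_image_related(q: Question) -> List[str]:
    # Placeholder: image-related questions get their own strategy,
    # but the evidence returned is still plain text.
    return [f"[image-context-search] {q.text}"]


# Dispatch table: one retrieval strategy per question type.
RETRIEVERS: Dict[str, Callable[[Question], List[str]]] = {
    "text": retrieve_text,
    "image": retrieve_image_related,
}


def gather_evidence(question_texts: List[str]) -> List[str]:
    """Type each question, route it, and collect text-only evidence."""
    evidence: List[str] = []
    for text in question_texts:
        q = Question(text=text, qtype=assign_type(text))
        evidence.extend(RETRIEVERS[q.qtype](q))
    return evidence
```

In this sketch, modality only affects which retriever is called; whatever consumes the output of `gather_evidence` sees strings, which mirrors the abstract's point that verdict prediction and justification rely on text-only evidence.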
Anthology ID: 2026.fever-1.10
Volume: Proceedings of the Ninth Fact Extraction and VERification Workshop (FEVER)
Month: March
Year: 2026
Address: Rabat, Morocco
Editors: Mubashara Akhtar, Rami Aly, Rui Cao, Christos Christodoulopoulos, Oana Cocarascu, Zhijiang Guo, Arpit Mittal, Michael Schlichtkrull, James Thorne, Andreas Vlachos
Venues: FEVER | WS
Publisher: Association for Computational Linguistics
Pages: 127–135
URL: https://aclanthology.org/2026.fever-1.10/
Cite (ACL): Yoana Tsoneva, Paul-Conrad Feig, Jiaao Li, Veronika Solopova, Neda Foroutan, Arthur Hilbert, and Vera Schmitt. 2026. Selective Multimodal Retrieval for Automated Verification of Image–Text Claims. In Proceedings of the Ninth Fact Extraction and VERification Workshop (FEVER), pages 127–135, Rabat, Morocco. Association for Computational Linguistics.
Cite (Informal): Selective Multimodal Retrieval for Automated Verification of Image–Text Claims (Tsoneva et al., FEVER 2026)
PDF: https://aclanthology.org/2026.fever-1.10.pdf