Multimodal Misinformation Detection by Learning from Synthetic Data with Multimodal LLMs

Fengzhu Zeng, Wenqian Li, Wei Gao, Yan Pang


Abstract
Detecting multimodal misinformation, especially in the form of image-text pairs, is crucial. Obtaining large-scale, high-quality real-world fact-checking datasets for training detectors is costly, leading researchers to use synthetic datasets generated by AI technologies. However, the generalizability of detectors trained on synthetic data to real-world scenarios remains unclear due to the distribution gap. To address this, we propose learning from synthetic data for detecting real-world multimodal misinformation through two model-agnostic data selection methods that match synthetic and real-world data distributions. Experiments show that our method enhances the performance of a small MLLM (13B) on real-world fact-checking datasets, enabling it to even surpass GPT-4V.
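The abstract only sketches the approach at a high level. As an illustration, below is a minimal, hypothetical sketch of one way distribution-matching data selection could be realized: synthetic image-text pairs are scored by how close their embeddings lie to the real-world data distribution, and the closest ones are kept for training. The centroid-similarity criterion, the function name, and all parameters here are assumptions for illustration only, not the authors' actual selection methods.

```python
import numpy as np

def select_synthetic(real_emb: np.ndarray, syn_emb: np.ndarray, keep_ratio: float = 0.5) -> np.ndarray:
    """Return indices of synthetic samples whose embeddings lie closest to the
    real-data distribution (approximated here by its centroid).

    real_emb: (N_real, d) embeddings of real fact-checking examples
    syn_emb:  (N_syn, d) embeddings of synthetic image-text pairs
    """
    # L2-normalize so cosine similarity reduces to a dot product.
    real = real_emb / np.linalg.norm(real_emb, axis=1, keepdims=True)
    syn = syn_emb / np.linalg.norm(syn_emb, axis=1, keepdims=True)

    # Approximate the real-data distribution by its normalized centroid.
    centroid = real.mean(axis=0)
    centroid /= np.linalg.norm(centroid)

    # Score each synthetic sample by similarity to the real centroid
    # and keep the top `keep_ratio` fraction for fine-tuning.
    scores = syn @ centroid
    k = max(1, int(keep_ratio * len(syn)))
    return np.argsort(-scores)[:k]
```

The selected subset would then be used to fine-tune the small MLLM; the actual paper describes two model-agnostic selection methods, which may differ from this centroid-based sketch.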
Anthology ID:
2024.findings-emnlp.613
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2024
Month:
November
Year:
2024
Address:
Miami, Florida, USA
Editors:
Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
10467–10484
URL:
https://aclanthology.org/2024.findings-emnlp.613
Cite (ACL):
Fengzhu Zeng, Wenqian Li, Wei Gao, and Yan Pang. 2024. Multimodal Misinformation Detection by Learning from Synthetic Data with Multimodal LLMs. In Findings of the Association for Computational Linguistics: EMNLP 2024, pages 10467–10484, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal):
Multimodal Misinformation Detection by Learning from Synthetic Data with Multimodal LLMs (Zeng et al., Findings 2024)
PDF:
https://aclanthology.org/2024.findings-emnlp.613.pdf