Multimodal Misinformation Detection by Learning from Synthetic Data with Multimodal LLMs

Fengzhu Zeng; Wenqian Li; Wei Gao; Yan Pang

Abstract

Detecting multimodal misinformation, especially in the form of image-text pairs, is crucial. Obtaining large-scale, high-quality real-world fact-checking datasets for training detectors is costly, leading researchers to use synthetic datasets generated by AI technologies. However, the generalizability of detectors trained on synthetic data to real-world scenarios remains unclear due to the distribution gap. To address this, we propose learning from synthetic data for detecting real-world multimodal misinformation through two model-agnostic data selection methods that match synthetic and real-world data distributions. Experiments show that our method enhances the performance of a small MLLM (13B) on real-world fact-checking datasets, enabling it to even surpass GPT-4V.

Anthology ID: 2024.findings-emnlp.613
Volume: Findings of the Association for Computational Linguistics: EMNLP 2024
Month: November
Year: 2024
Address: Miami, Florida, USA
Editors: Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 10467–10484
URL: https://aclanthology.org/2024.findings-emnlp.613
Cite (ACL): Fengzhu Zeng, Wenqian Li, Wei Gao, and Yan Pang. 2024. Multimodal Misinformation Detection by Learning from Synthetic Data with Multimodal LLMs. In Findings of the Association for Computational Linguistics: EMNLP 2024, pages 10467–10484, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal): Multimodal Misinformation Detection by Learning from Synthetic Data with Multimodal LLMs (Zeng et al., Findings 2024)
PDF: https://aclanthology.org/2024.findings-emnlp.613.pdf