Mitigating Object Hallucinations in MLLMs via Multi-Frequency Perturbations
Shuo Li | Jiajun Sun | Guodong Zheng | Xiaoran Fan | Yujiong Shen | Yi Lu | Zhiheng Xi | Yuming Yang | Wenming Tan | Tao Ji | Tao Gui | Qi Zhang | Xuanjing Huang
Findings of the Association for Computational Linguistics: EMNLP 2025
Recently, multimodal large language models (MLLMs) have demonstrated remarkable performance on vision-language tasks. However, the faithfulness of the responses generated by MLLMs is often compromised by object hallucinations. We identify a key cause of these hallucinations: the model’s over-susceptibility to image frequency features when detecting objects. In this paper, we introduce Multi-Frequency Perturbations (MFP), a simple, cost-effective, and pluggable adversarial training method. MFP leverages both the low-frequency and high-frequency features of images to perturb visual feature representations during training, and explicitly suppresses redundant frequency-domain features during inference, thereby mitigating hallucinations. Experimental results demonstrate that our method significantly mitigates object hallucinations across various model architectures. Furthermore, as a training-time method, MFP can be combined with inference-time methods to achieve state-of-the-art performance on the CHAIR benchmark.
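The abstract does not include implementation details, so the sketch below is only a rough illustration of the kind of frequency-domain decomposition MFP builds on: splitting an image into low- and high-frequency components with an FFT mask and perturbing each band separately. The function name `frequency_split`, the mask radius, and the noise scales are assumptions for illustration, not the authors' released code.

```python
# Hypothetical sketch (not the authors' implementation): build low-/high-frequency
# views of an image via a centered FFT mask, then perturb each band.
import torch

def frequency_split(image: torch.Tensor, radius: int = 16):
    """Split a (C, H, W) image into low- and high-frequency components.

    A centered circular mask of the given radius in the 2D Fourier domain
    keeps the low frequencies; its complement keeps the high frequencies.
    """
    c, h, w = image.shape
    # 2D FFT per channel, shifted so low frequencies sit at the spectrum center.
    spectrum = torch.fft.fftshift(torch.fft.fft2(image), dim=(-2, -1))

    # Circular low-pass mask around the center (radius is an illustrative choice).
    ys = torch.arange(h).view(-1, 1) - h // 2
    xs = torch.arange(w).view(1, -1) - w // 2
    low_mask = ((ys ** 2 + xs ** 2) <= radius ** 2).to(spectrum.dtype)

    low = torch.fft.ifft2(torch.fft.ifftshift(spectrum * low_mask, dim=(-2, -1))).real
    high = torch.fft.ifft2(torch.fft.ifftshift(spectrum * (1 - low_mask), dim=(-2, -1))).real
    return low, high

# Example: perturb each band with small Gaussian noise and recombine into an
# augmented training view (the noise scales are illustrative, not from the paper).
image = torch.rand(3, 224, 224)
low, high = frequency_split(image)
perturbed = (low + 0.03 * torch.randn_like(low)) + (high + 0.01 * torch.randn_like(high))
```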