Improving Alignment in LVLMs with Debiased Self-Judgment

Sihan Yang, Chenhang Cui, Zihao Zhao, Yiyang Zhou, Weilong Yan, Ying Wei, Huaxiu Yao


Abstract
The rapid advancements in Large Language Models (LLMs) and Large Vision-Language Models (LVLMs) have opened up new opportunities for integrating visual and linguistic modalities. Yet challenges remain in aligning these modalities effectively, leading to hallucinations, where generated outputs are not grounded in the visual input, and to safety concerns when LVLMs are deployed across various domains. Existing alignment methods, such as instruction tuning and preference tuning, often rely on external datasets, human annotations, or complex post-processing, limiting scalability and adding cost. To address these challenges, we propose a novel approach that generates a debiased self-judgment score, a self-evaluation metric produced internally by the model without relying on external resources, enabling the model to autonomously improve alignment. Our method enhances both decoding strategies and preference tuning, resulting in improved alignment, reduced hallucinations, and enhanced safety. Empirical results show that our approach significantly outperforms traditional methods, offering a more effective solution for aligning LVLMs.
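The abstract does not spell out how the debiased self-judgment score is computed. The sketch below is one plausible reading, not the authors' formulation: the model judges its own candidate response once with the image and once without it, and the difference cancels the language-prior bias so the residual score reflects visual grounding. Every interface here is hypothetical; `model.yes_logprob` stands in for whatever API exposes the log-probability of the model answering "Yes".

```python
# Minimal sketch of a debiased self-judgment score (hypothetical reading of
# the abstract, not the paper's exact method). `model.yes_logprob(prompt,
# image=...)` is an assumed interface returning the log-probability that the
# model answers "Yes" to the prompt.

def debiased_self_judgment(model, image, question, candidate):
    """Score how strongly the image itself supports `candidate`.

    The model judges its own output twice: once conditioned on the image
    and once with the image withheld. Subtracting the image-free judgment
    removes the language prior, so what remains reflects visual grounding
    rather than mere fluency.
    """
    judge_prompt = (
        f"Question: {question}\n"
        f"Response: {candidate}\n"
        "Is this response factually grounded in the image? Answer Yes or No."
    )
    grounded = model.yes_logprob(judge_prompt, image=image)  # vision + prior
    prior = model.yes_logprob(judge_prompt, image=None)      # prior only
    return grounded - prior


def rerank_candidates(model, image, question, candidates):
    """One way such a score could steer decoding: rerank sampled responses."""
    return max(candidates,
               key=lambda c: debiased_self_judgment(model, image, question, c))
```

In the same spirit, the score could rank response pairs for preference tuning (higher-scoring responses as "chosen"), which would match the abstract's claim that it benefits both decoding and preference optimization.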
Anthology ID: 2025.findings-emnlp.436
Volume: Findings of the Association for Computational Linguistics: EMNLP 2025
Month: November
Year: 2025
Address: Suzhou, China
Editors: Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 8213–8232
URL: https://aclanthology.org/2025.findings-emnlp.436/
Cite (ACL): Sihan Yang, Chenhang Cui, Zihao Zhao, Yiyang Zhou, Weilong Yan, Ying Wei, and Huaxiu Yao. 2025. Improving Alignment in LVLMs with Debiased Self-Judgment. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 8213–8232, Suzhou, China. Association for Computational Linguistics.
Cite (Informal): Improving Alignment in LVLMs with Debiased Self-Judgment (Yang et al., Findings 2025)
PDF: https://aclanthology.org/2025.findings-emnlp.436.pdf
Checklist: 2025.findings-emnlp.436.checklist.pdf