Zhaoyang Wang


2025

Enhancing Visual-Language Modality Alignment in Large Vision Language Models via Self-Improvement
Xiyao Wang | Jiuhai Chen | Zhaoyang Wang | Yuhang Zhou | Yiyang Zhou | Huaxiu Yao | Tianyi Zhou | Tom Goldstein | Parminder Bhatia | Taha Kass-Hout | Furong Huang | Cao Xiao
Findings of the Association for Computational Linguistics: NAACL 2025

Large vision-language models (LVLMs) have achieved impressive results in visual question-answering and reasoning tasks through vision instruction tuning on specific datasets. However, there remains significant room for improvement in aligning visual and language modalities. Existing methods often depend on external models or data, leading to uncontrollable and unstable alignment results. In this paper, we propose SIMA, a self-improvement framework that enhances visual and language modality alignment without external dependencies. SIMA leverages existing vision instruction tuning datasets to self-generate responses, incorporating an in-context self-critic mechanism that constructs preference pairs for tuning. Crucially, our approach allows LVLMs to act as critics by designing effective critic prompts, eliminating the need for additional fine-tuning with external instruction data. We introduce three novel visual metrics within the self-critic process to guide judgement, significantly improving the accuracy of the self-critic. Through extensive experiments across 14 hallucination and comprehensive benchmarks, we demonstrate that SIMA significantly improves LVLM performance and outperforms previous approaches, achieving superior modality alignment.

Verifiable Format Control for Large Language Model Generations
Zhaoyang Wang | Jinqi Jiang | Huichi Zhou | Wenhao Zheng | Xuchao Zhang | Chetan Bansal | Huaxiu Yao
Findings of the Association for Computational Linguistics: NAACL 2025

Recent Large Language Models (LLMs) have demonstrated satisfactory general instruction-following ability. However, small LLMs with about 7B parameters still struggle with fine-grained format following (e.g., JSON format), which seriously hinders their application. Most existing methods focus on benchmarking general instruction following while overlooking how to improve the specific format-following ability of small LLMs. Besides, these methods often rely on evaluations based on advanced LLMs (e.g., GPT-4), which can introduce the intrinsic bias of LLMs and be costly due to the API calls. In this paper, we first curate a fully verifiable format-following dataset, VFF. In contrast to existing works that often adopt external LLMs for instruction-following validation, every sample of VFF can be easily validated with a Python function. Further, we propose to leverage this verifiable feature to synthesize massive data for progressively training small LLMs, in order to improve their format-following abilities. Experimental results highlight the prevalent limitations in the format-following capabilities of 7B-level open-source LLMs and demonstrate the effectiveness of our method in enhancing this essential ability.
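
As an illustration of what validating a sample "with a Python function" could look like, here is a minimal sketch for a JSON format constraint. It is not taken from the VFF dataset: the function name `validate_json_format` and the `required_keys` parameter are hypothetical, and the actual VFF validators may check different constraints.

```python
import json


def validate_json_format(response: str, required_keys: tuple = ("answer",)) -> bool:
    """Return True if `response` is a JSON object containing every required key.

    Hypothetical checker for illustration only; not the VFF implementation.
    """
    try:
        parsed = json.loads(response)
    except json.JSONDecodeError:
        # The response is not valid JSON at all.
        return False
    # Require a JSON object (dict) rather than a list, string, or number.
    return isinstance(parsed, dict) and all(key in parsed for key in required_keys)
```

Because the check is deterministic and needs no external LLM judge, the same kind of function could in principle be reused to filter synthesized training samples, which is the property the abstract relies on.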

2024

Evaluating the Validity of Word-level Adversarial Attacks with Large Language Models
Huichi Zhou | Zhaoyang Wang | Hongtao Wang | Dongping Chen | Wenhan Mu | Fangyuan Zhang
Findings of the Association for Computational Linguistics: ACL 2024

Deep neural networks exhibit vulnerability to word-level adversarial attacks in natural language processing. Most of these attack methods adopt synonymous substitutions to perturb original samples for crafting adversarial examples while attempting to maintain semantic consistency with the originals. Some of them claim that they could achieve over 90% attack success rate, thereby raising serious safety concerns. However, our investigation reveals that many purportedly successful adversarial examples are actually invalid due to significant changes in semantic meanings compared to their originals. Even when equipped with semantic constraints such as BERTScore, existing attack methods can generate up to 87.9% invalid adversarial examples. Building on this insight, we first curate a 13K dataset for adversarial validity evaluation with the help of GPT-4. Then, an open-source large language model is fine-tuned to offer an interpretable validity score for assessing the semantic consistency between original and adversarial examples. Finally, this validity score can serve as a guide for existing adversarial attack methods to generate valid adversarial examples. Comprehensive experiments demonstrate the effectiveness of our method in evaluating and refining the quality of adversarial examples.

2023

RMLM: A Flexible Defense Framework for Proactively Mitigating Word-level Adversarial Attacks
Zhaoyang Wang | Zhiyue Liu | Xiaopeng Zheng | Qinliang Su | Jiahai Wang
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Adversarial attacks on deep neural networks keep raising security concerns in natural language processing research. Existing defenses focus on improving the robustness of the victim model in the training stage. However, they often neglect to proactively mitigate adversarial attacks during inference. To address this overlooked aspect, we propose a defense framework that aims to mitigate attacks by confusing attackers and correcting adversarial contexts that are caused by malicious perturbations. Our framework comprises three components: (1) a synonym-based transformation to randomly corrupt adversarial contexts at the word level, (2) a developed BERT defender to correct abnormal contexts at the representation level, and (3) a simple detection method to filter out adversarial examples, any of which can be flexibly combined. Additionally, our framework helps improve the robustness of the victim model during training. Extensive experiments demonstrate the effectiveness of our framework in defending against word-level adversarial attacks.
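
The first component, a random synonym-based transformation at the word level, can be sketched roughly as follows. This is a generic illustration, not the RMLM implementation: the toy `SYNONYMS` table, the function name `random_synonym_transform`, and the `rate` parameter are all assumptions, since the abstract does not specify how synonyms are selected or how often words are replaced.

```python
import random

# Hypothetical toy synonym table; a real system would use a lexical resource.
SYNONYMS = {"good": ["great", "fine"], "movie": ["film"]}


def random_synonym_transform(tokens, synonyms, rate=0.3, rng=None):
    """Randomly replace tokens that have known synonyms.

    Each token found in `synonyms` is swapped for a random synonym with
    probability `rate`; all other tokens pass through unchanged.
    """
    rng = rng or random.Random()
    transformed = []
    for token in tokens:
        candidates = synonyms.get(token.lower())
        if candidates and rng.random() < rate:
            transformed.append(rng.choice(candidates))
        else:
            transformed.append(token)
    return transformed
```

Applied at inference time, such a transformation makes the input the model sees differ from the one the attacker crafted, which is one plausible reading of how randomness can "confuse attackers".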

Democratizing Reasoning Ability: Tailored Learning from Large Language Model
Zhaoyang Wang | Shaohan Huang | Yuxuan Liu | Jiahai Wang | Minghui Song | Zihan Zhang | Haizhen Huang | Furu Wei | Weiwei Deng | Feng Sun | Qi Zhang
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

Large language models (LLMs) exhibit impressive emergent abilities in natural language processing, but their democratization is hindered by their huge computation requirements and closed-source nature. Recent research on advancing open-source smaller LMs by distilling knowledge from black-box LLMs has obtained promising results in instruction-following ability. However, reasoning ability, which is more challenging to foster, remains relatively unexplored. In this paper, we propose a tailored learning approach to distill such reasoning ability to smaller LMs to facilitate the democratization of the exclusive reasoning ability. In contrast to merely employing the LLM as a data annotator, we exploit the potential of the LLM as a reasoning teacher by building an interactive multi-round learning paradigm. This paradigm enables the student to expose its deficiencies to the black-box teacher, who can then provide customized training data in return. Further, to exploit the reasoning potential of the smaller LM, we propose self-reflection learning to motivate the student to learn from self-made mistakes. Learning from self-reflection and learning from the LLM are both tailored to the student’s learning status, thanks to the seamless integration with the multi-round learning paradigm. Comprehensive experiments and analysis on mathematical and commonsense reasoning tasks demonstrate the effectiveness of our method. The code will be available at https://github.com/Raibows/Learn-to-Reason.

2022

UECA-Prompt: Universal Prompt for Emotion Cause Analysis
Xiaopeng Zheng | Zhiyue Liu | Zizhen Zhang | Zhaoyang Wang | Jiahai Wang
Proceedings of the 29th International Conference on Computational Linguistics

Emotion cause analysis (ECA) aims to extract emotion clauses and find the corresponding cause of the emotion. Existing methods adopt the fine-tuning paradigm to solve certain types of ECA tasks. These task-specific methods lack universality. In addition, the relations among multiple objectives in one task are not explicitly modeled. Moreover, the relative position information introduced in most existing methods may make the model suffer from dataset bias. To address the first two problems, this paper proposes a universal prompt tuning method to solve different ECA tasks in a unified framework. As for the third problem, this paper designs a directional constraint module and a sequential learning module to ease the bias. Considering the commonalities among different tasks, this paper proposes a cross-task training method to further explore the capability of the model. The experimental results show that our method achieves competitive performance on the ECA datasets.