Ruoyu Wang


2025

Mitigating Visual Knowledge Forgetting in MLLM Instruction-tuning via Modality-decoupled Gradient Descent
Junda Wu | Yuxin Xiong | Xintong Li | Yu Xia | Ruoyu Wang | Yu Wang | Tong Yu | Sungchul Kim | Ryan A. Rossi | Lina Yao | Jingbo Shang | Julian McAuley
Findings of the Association for Computational Linguistics: EMNLP 2025

Recent multimodal large language models (MLLMs) have demonstrated strong visual understanding and reasoning after large-scale multimodal pre-training. However, instruction-tuning is typically text-driven with limited visual supervision, leading to significant visual forgetting and degradation of pre-trained visual knowledge. Existing fine-tuning and continual learning methods compress visual representations and emphasize task alignment over visual retention, failing to address this challenge. We present a novel perspective using effective rank to quantify the loss of visual representation richness, framing visual forgetting as excessive compression under the information bottleneck principle. To address this, we propose modality-decoupled gradient descent (MDGD), which regulates gradient updates to preserve the effective rank of visual features and explicitly disentangles visual learning from task-specific alignment. We further introduce a memory-efficient fine-tuning variant using gradient masking for parameter-efficient adaptation. Extensive experiments show that MDGD effectively mitigates visual forgetting across downstream tasks and models, maintaining pre-trained visual knowledge while supporting strong task adaptation.
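
The effective-rank measure referenced above has a standard definition (exponentiated Shannon entropy of the normalized singular values of a feature matrix). The sketch below, in PyTorch, illustrates that general measure only; it is not the authors' MDGD implementation, and the tensor shapes and variable names are assumptions for illustration.

import torch

def effective_rank(features: torch.Tensor, eps: float = 1e-12) -> torch.Tensor:
    """Effective rank of a feature matrix: exp of the Shannon entropy
    of the normalized singular-value distribution. A drop in this value
    after fine-tuning indicates compressed (less rich) representations."""
    # features: (num_tokens, hidden_dim) visual feature matrix (assumed shape)
    s = torch.linalg.svdvals(features.float())   # singular values
    p = s / (s.sum() + eps)                      # normalize to a distribution
    entropy = -(p * torch.log(p + eps)).sum()    # Shannon entropy
    return torch.exp(entropy)                    # effective rank

# Hypothetical usage: compare visual-feature richness before and after tuning
pre_features = torch.randn(256, 1024)            # placeholder pre-trained features
post_features = pre_features @ torch.randn(1024, 1024) * 0.1
print(effective_rank(pre_features), effective_rank(post_features))

In this framing, a fine-tuning step that sharply lowers the effective rank of the visual features is treated as over-compressing them, which is the signal MDGD is described as regulating against.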

2021

Constructing Flow Graphs from Procedural Cybersecurity Texts
Kuntal Kumar Pal | Kazuaki Kashihara | Pratyay Banerjee | Swaroop Mishra | Ruoyu Wang | Chitta Baral
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021