M2Edit: Locate and Edit Multi-Granularity Knowledge in Multimodal Large Language Model

Yang Zhou; Pengfei Cao; Yubo Chen (陈玉博); Qingbin Liu; Dianbo Sui; Xi Chen; Kang Liu; Jun Zhao

doi:10.18653/v1/2025.emnlp-main.1478

Abstract

Multimodal knowledge editing is an important method for modifying outdated or incorrect knowledge in Multimodal Large Language Models (MLLMs). However, existing datasets for multimodal knowledge editing lack multi-granularity knowledge. In this paper, we present a more realistic dataset called M2Edit, which includes three distinct types of knowledge: entity, relation, and action. Additionally, existing knowledge editing methods for MLLMs lack the ability to handle multi-granularity knowledge and generalize to multimodal data. To address these limitations, we propose the multimodal knowledge editing method MLE. This approach identifies key knowledge layers within different components and collaboratively edits the various components of MLLMs. As a result, we observe significant improvements in visual generality performance, ranging from 4.8 to 10.8, and achieve the best overall performance on knowledge data of different granularities.

Anthology ID: 2025.emnlp-main.1478
Volume: Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Month: November
Year: 2025
Address: Suzhou, China
Editors: Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 29029–29042
URL: https://aclanthology.org/2025.emnlp-main.1478/
DOI: 10.18653/v1/2025.emnlp-main.1478
Cite (ACL): Yang Zhou, Pengfei Cao, Yubo Chen, Qingbin Liu, Dianbo Sui, Xi Chen, Kang Liu, and Jun Zhao. 2025. M2Edit: Locate and Edit Multi-Granularity Knowledge in Multimodal Large Language Model. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 29029–29042, Suzhou, China. Association for Computational Linguistics.
Cite (Informal): M2Edit: Locate and Edit Multi-Granularity Knowledge in Multimodal Large Language Model (Zhou et al., EMNLP 2025)
PDF: https://aclanthology.org/2025.emnlp-main.1478.pdf
Checklist: 2025.emnlp-main.1478.checklist.pdf
