Multi-Modal Entities Matter: Benchmarking Multi-Modal Entity Alignment

GuanChen Xiao, WeiXin Zeng, ShiQi Zhang, MingRui Lao, Xiang Zhao


Abstract
Multi-modal entity alignment (MMEA) is a long-standing task that aims to discover identical entities across different multi-modal knowledge graphs (MMKGs). However, most existing MMEA datasets treat multi-modal data merely as attributes of textual entities, neglecting the correlations among the multi-modal data, and thus do not fit real-world scenarios well. In response, in this work, we establish a novel yet practical MMEA dataset, i.e., NMMEA, which models multi-modal data (e.g., images) as entities on equal footing with textual entities in the MMKG. The introduction of multi-modal entities poses new challenges to existing MMEA solutions, namely heterogeneous structural representation learning and cross-modal alignment inference. Hence, we put forward a simple yet effective solution, CrossEA, which learns the structural information of entities by considering both intra-modal and cross-modal relations, and further infers the similarity of different types of entity pairs. Extensive experiments validate the significance of NMMEA, where CrossEA achieves superior performance compared with competitive methods on the proposed dataset.
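To make the setting concrete, the sketch below illustrates the kind of cross-modal alignment inference the abstract describes: when images are modeled as entities rather than attributes, candidate pairs between two MMKGs can be text-text, image-image, or text-image, and each pair type gets its own similarity scores. This is only a minimal illustration under assumed inputs (random toy embeddings, cosine similarity, greedy matching); it is not the paper's CrossEA implementation, and every name and step here is a placeholder.

```python
import numpy as np

def cosine_sim(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity between two embedding matrices."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T

# Toy embeddings for two MMKGs. Textual and image entities are both
# first-class nodes, so alignment candidates come in several pair types.
# (Random vectors stand in for the structure-aware embeddings a real
# model would learn from intra-modal and cross-modal relations.)
rng = np.random.default_rng(0)
src_text, src_img = rng.normal(size=(5, 64)), rng.normal(size=(3, 64))
tgt_text, tgt_img = rng.normal(size=(6, 64)), rng.normal(size=(4, 64))

# One similarity matrix per pair type.
sim_tt = cosine_sim(src_text, tgt_text)   # text  -> text
sim_ii = cosine_sim(src_img, tgt_img)     # image -> image
sim_ti = cosine_sim(src_text, tgt_img)    # text  -> image (cross-modal)

# Greedy inference: each source entity is aligned to its highest-scoring
# counterpart of the same type (cross-modal scores could be fused in as well).
print("text alignments:", sim_tt.argmax(axis=1))
print("image alignments:", sim_ii.argmax(axis=1))
```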
Anthology ID:
2025.coling-main.582
Volume:
Proceedings of the 31st International Conference on Computational Linguistics
Month:
January
Year:
2025
Address:
Abu Dhabi, UAE
Editors:
Owen Rambow, Leo Wanner, Marianna Apidianaki, Hend Al-Khalifa, Barbara Di Eugenio, Steven Schockaert
Venue:
COLING
Publisher:
Association for Computational Linguistics
Pages:
8714–8724
URL:
https://aclanthology.org/2025.coling-main.582/
Cite (ACL):
GuanChen Xiao, WeiXin Zeng, ShiQi Zhang, MingRui Lao, and Xiang Zhao. 2025. Multi-Modal Entities Matter: Benchmarking Multi-Modal Entity Alignment. In Proceedings of the 31st International Conference on Computational Linguistics, pages 8714–8724, Abu Dhabi, UAE. Association for Computational Linguistics.
Cite (Informal):
Multi-Modal Entities Matter: Benchmarking Multi-Modal Entity Alignment (Xiao et al., COLING 2025)
PDF:
https://aclanthology.org/2025.coling-main.582.pdf