MemeReaCon: Probing Contextual Meme Understanding in Large Vision-Language Models

Zhengyi Zhao; Shubo Zhang; Yuxi Zhang; Yanxi Zhao; Yifan Zhang; Zezhong Wang; Huimin Wang; Yutian Zhao; Bin Liang (梁斌); Yefeng Zheng; Binyang Li; Kam-Fai Wong; Xian Wu

doi:10.18653/v1/2025.emnlp-main.176

Abstract

Memes have emerged as a popular form of multimodal online communication whose interpretation depends heavily on the specific context in which they appear. Current approaches predominantly focus on isolated meme analysis, either for harmful content detection or standalone interpretation, overlooking a fundamental challenge: the same meme can express different intents depending on its conversational context. This oversight creates an evaluation gap: although humans intuitively recognize how context shapes meme interpretation, Large Vision-Language Models (LVLMs) struggle to understand context-dependent meme intent. To address this limitation, we introduce MemeReaCon, a novel benchmark specifically designed to evaluate how LVLMs understand memes in their original context. We collected memes from five different Reddit communities, keeping each meme’s image, the post text, and user comments together. We carefully labeled how the text and meme work together, what the poster intended, how the meme is structured, and how the community responded. Our tests with leading LVLMs show a clear weakness: models either fail to interpret critical information in the context or focus excessively on visual details while overlooking communicative purpose. MemeReaCon thus serves both as a diagnostic tool that exposes current limitations and as a challenging benchmark to drive the development of LVLMs with more sophisticated context-aware understanding.

Anthology ID: 2025.emnlp-main.176
Volume: Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Month: November
Year: 2025
Address: Suzhou, China
Editors: Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 3559–3582
URL: https://aclanthology.org/2025.emnlp-main.176/
DOI: 10.18653/v1/2025.emnlp-main.176
Cite (ACL): Zhengyi Zhao, Shubo Zhang, Yuxi Zhang, Yanxi Zhao, Yifan Zhang, Zezhong Wang, Huimin Wang, Yutian Zhao, Bin Liang, Yefeng Zheng, Binyang Li, Kam-Fai Wong, and Xian Wu. 2025. MemeReaCon: Probing Contextual Meme Understanding in Large Vision-Language Models. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 3559–3582, Suzhou, China. Association for Computational Linguistics.
Cite (Informal): MemeReaCon: Probing Contextual Meme Understanding in Large Vision-Language Models (Zhao et al., EMNLP 2025)
PDF: https://aclanthology.org/2025.emnlp-main.176.pdf
Checklist: 2025.emnlp-main.176.checklist.pdf
