Mementos: A Comprehensive Benchmark for Multimodal Large Language Model Reasoning over Image Sequences

Authors: Xiyao Wang, Yuhang Zhou, Xiaoyu Liu, Hongjin Lu, Yuancheng Xu, Feihong He, Jaehong Yoon, Taixi Lu, Fuxiao Liu, Gedas Bertasius, Mohit Bansal, Huaxiu Yao, Furong Huang
Published in: Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Editors: Lun-Wei Ku, Andre Martins, Vivek Srikumar
Publisher: Association for Computational Linguistics
Location: Bangkok, Thailand
Date: 2024-08
Type: conference publication
Pages: 416-442
Citation key: wang-etal-2024-mementos
DOI: 10.18653/v1/2024.acl-long.25
URL: https://aclanthology.org/2024.acl-long.25/