Title: Muffin or Chihuahua? Challenging Multimodal Large Language Models with Multipanel VQA
Authors: Yue Fan, Jing Gu, Kaiwen Zhou, Qianqi Yan, Shan Jiang, Ching-Chen Kuo, Yang Zhao, Xinze Guan, Xin Wang
Published in: Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Editors: Lun-Wei Ku, Andre Martins, Vivek Srikumar
Publisher: Association for Computational Linguistics
Location: Bangkok, Thailand
Date: 2024-08
Type: conference publication
Pages: 6845-6863
Citation key: fan-etal-2024-muffin
DOI: 10.18653/v1/2024.acl-long.370
URL: https://aclanthology.org/2024.acl-long.370/