What Are We Measuring When We Evaluate Large Vision-Language Models? An Analysis of Latent Factors and Biases

Authors: Anthony Tiong, Junqi Zhao, Boyang Li, Junnan Li, Steven Hoi, Caiming Xiong
Published: 2024-06
Type: conference publication
In: Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
Editors: Kevin Duh, Helena Gomez, Steven Bethard
Publisher: Association for Computational Linguistics
Location: Mexico City, Mexico
Pages: 3427-3454
Citation key: tiong-etal-2024-measuring
DOI: 10.18653/v1/2024.naacl-long.188
URL: https://aclanthology.org/2024.naacl-long.188/