VerbCLIP: Improving Verb Understanding in Vision-Language Models with Compositional Structures

Authors: Hadi Wazni, Kin Ian Lo, Mehrnoosh Sadrzadeh
Published in: Proceedings of the 3rd Workshop on Advances in Language and Vision Research (ALVR)
Editors: Jing Gu, Tsu-Jui (Ray) Fu, Drew Hudson, Asli Celikyilmaz, William Wang
Publisher: Association for Computational Linguistics
Location: Bangkok, Thailand
Date: 2024-08
Type: conference publication
Pages: 195-201
Citation key: wazni-etal-2024-verbclip
DOI: 10.18653/v1/2024.alvr-1.17
URL: https://aclanthology.org/2024.alvr-1.17/