PolCLIP: A Unified Image-Text Word Sense Disambiguation Model via Generating Multimodal Complementary Representations

Authors: Qihao Yang, Yong Li, Xuelin Wang, Fu Lee Wang, Tianyong Hao
Published in: Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Editors: Lun-Wei Ku, Andre Martins, Vivek Srikumar
Publisher: Association for Computational Linguistics
Location: Bangkok, Thailand
Date: 2024-08
Type: conference publication
Pages: 10676-10690
Anthology ID: yang-etal-2024-polclip
DOI: 10.18653/v1/2024.acl-long.575
URL: https://aclanthology.org/2024.acl-long.575/