Cong Huang
2023
ECNU_MIV at SemEval-2023 Task 1: CTIM - Contrastive Text-Image Model for Multilingual Visual Word Sense Disambiguation
Zhenghui Li | Qi Zhang | Xueyin Xia | Yinxiang Ye | Qi Zhang | Cong Huang
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)
Our team focuses on the multimodal domain of images and text. We propose a model that learns the matching relationship between text-image pairs through contrastive learning. More specifically, we train the model on the labeled data provided by the task organizers; after pre-training, text is used to reference the learned visual concepts, enabling visual word sense disambiguation. In addition, our team's top results have been released, showing the effectiveness of our solution.
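As a rough illustration of contrastive text-image matching, the sketch below computes a symmetric contrastive loss over a batch of paired embeddings and then ranks candidate images for a text query. The abstract does not specify the loss, the encoders, or the temperature, so the CLIP-style symmetric cross-entropy, the value 0.07, and the function names `contrastive_loss` and `rank_candidates` are all assumptions made for this example, not the authors' implementation.

```python
import math


def normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    a, b = normalize(a), normalize(b)
    return sum(x * y for x, y in zip(a, b))


def log_softmax(row):
    """Numerically stable log-softmax of a list of scores."""
    m = max(row)
    log_z = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_z for x in row]


def contrastive_loss(text_embs, image_embs, temperature=0.07):
    """Symmetric contrastive loss (assumed CLIP-style, not from the paper).

    text_embs[k] and image_embs[k] are treated as a matching pair; every
    other combination in the batch acts as a negative.
    """
    n = len(text_embs)
    logits = [[cosine(t, i) / temperature for i in image_embs] for t in text_embs]
    # Text-to-image direction: each row should peak on its diagonal entry.
    t2i = -sum(log_softmax(logits[k])[k] for k in range(n)) / n
    # Image-to-text direction: same idea over the columns.
    cols = [[logits[r][c] for r in range(n)] for c in range(n)]
    i2t = -sum(log_softmax(cols[k])[k] for k in range(n)) / n
    return (t2i + i2t) / 2


def rank_candidates(text_emb, candidate_embs):
    """Return candidate indices sorted from most to least similar to the text."""
    scores = [cosine(text_emb, c) for c in candidate_embs]
    return sorted(range(len(scores)), key=lambda k: scores[k], reverse=True)
```

At inference time, a visual word sense disambiguation system of this kind would embed the ambiguous word with its context, embed each candidate image, and return the top-ranked candidate from `rank_candidates`. The toy vectors in any call stand in for the outputs of real text and image encoders.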