teamPN at SemEval-2023 Task 1: Visual Word Sense Disambiguation Using Zero-Shot MultiModal Approach

Nikita Katyal, Pawan Rajpoot, Subhanandh Tamilarasu, Joy Mustafi


Abstract
The Visual Word Sense Disambiguation shared task at SemEval-2023 aims to identify the image corresponding to the intended meaning of a given ambiguous word (with related context) from a set of candidate images. The lack of textual descriptions for the candidate images and the ambiguity of the target word make it a challenging problem. This paper describes teamPN’s multi-modal and modular approach to solving this problem in the English track of the task. We efficiently used recent multi-modal pre-trained models backed by real-time multi-modal knowledge graphs to augment textual knowledge for the images and select the best-matching image accordingly. We outperformed the baseline model by ~5 points and proposed a unique approach that can further serve as a framework for other modular and knowledge-backed solutions.
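The selection step mentioned in the abstract (picking the best-matching image from a candidate set) can be illustrated with a minimal sketch. It assumes the ambiguous word with its context and each candidate image have already been mapped into a shared embedding space by some multi-modal model; the abstract does not name the model or the scoring function, so the toy vectors, the file names, the cosine-similarity scoring, and the function names `cosine` and `rank_candidates` are all hypothetical and not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_candidates(text_vec, image_vecs):
    """Return candidate image names sorted from most to least similar to the text."""
    scores = {name: cosine(text_vec, vec) for name, vec in image_vecs.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy embeddings (assumed): in practice these would come from a pre-trained
# multi-modal encoder applied to the context phrase and to each image.
context_vec = [0.9, 0.1, 0.0]
candidates = {
    "img_a.jpg": [0.8, 0.2, 0.1],
    "img_b.jpg": [0.0, 1.0, 0.0],
    "img_c.jpg": [0.0, 0.0, 1.0],
}

ranking = rank_candidates(context_vec, candidates)
best_image = ranking[0]  # the candidate predicted to match the intended sense
```

Under this assumption, any text augmentation (for example from a knowledge graph) would only change how `context_vec` or the image vectors are produced; the ranking step itself stays the same.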
Anthology ID:
2023.semeval-1.63
Volume:
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Atul Kr. Ojha, A. Seza Doğruöz, Giovanni Da San Martino, Harish Tayyar Madabushi, Ritesh Kumar, Elisa Sartori
Venue:
SemEval
SIG:
SIGLEX
Publisher:
Association for Computational Linguistics
Pages:
457–461
URL:
https://aclanthology.org/2023.semeval-1.63
DOI:
10.18653/v1/2023.semeval-1.63
Cite (ACL):
Nikita Katyal, Pawan Rajpoot, Subhanandh Tamilarasu, and Joy Mustafi. 2023. teamPN at SemEval-2023 Task 1: Visual Word Sense Disambiguation Using Zero-Shot MultiModal Approach. In Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023), pages 457–461, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
teamPN at SemEval-2023 Task 1: Visual Word Sense Disambiguation Using Zero-Shot MultiModal Approach (Katyal et al., SemEval 2023)
PDF:
https://aclanthology.org/2023.semeval-1.63.pdf