Subhanandh Tamilarasu
2023
teamPN at SemEval-2023 Task 1: Visual Word Sense Disambiguation Using Zero-Shot MultiModal Approach
Nikita Katyal | Pawan Rajpoot | Subhanandh Tamilarasu | Joy Mustafi
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)
The Visual Word Sense Disambiguation shared task at SemEval-2023 aims to identify, from a set of candidate images, the image that corresponds to the intended meaning of a given ambiguous word (with related context). The lack of textual descriptions for the candidate images and the ambiguity of the target word make this a challenging problem. This paper describes teamPN’s multi-modal and modular approach to the English track of the task. We efficiently used recent multi-modal pre-trained models, backed by real-time multi-modal knowledge graphs, to augment textual knowledge for the images and select the best-matching image accordingly. We outperformed the baseline model by ~5 points and proposed a unique approach that can further serve as a framework for other modular and knowledge-backed solutions.
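The image-selection step that the task calls for can be illustrated with a minimal sketch. It assumes that the context phrase and every candidate image have already been mapped to vectors in a shared embedding space by some pre-trained multi-modal encoder. The toy vectors, the file names, and the helper names `cosine` and `pick_image` are hypothetical and are not taken from the paper; the knowledge-graph augmentation the abstract mentions is not shown.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def pick_image(text_vec, image_vecs):
    """Return the name of the candidate whose vector is most similar to the text."""
    return max(image_vecs, key=lambda name: cosine(text_vec, image_vecs[name]))


# Made-up 3-d vectors standing in for encoder outputs (assumption, not real data).
context = [0.9, 0.1, 0.0]  # hypothetical embedding of an ambiguous word plus its context
candidates = {
    "img_river.jpg": [0.8, 0.2, 0.1],
    "img_money.jpg": [0.1, 0.9, 0.2],
    "img_chair.jpg": [0.0, 0.1, 0.9],
}

best = pick_image(context, candidates)
print(best)
```

In a zero-shot setting no task-specific training is involved: the ranking depends only on the similarity scores, so any extra text added to the context changes the result only by changing the text vector.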