Jiarong Tang


2025

This paper presents our approach for SemEval-2025 Task 1, Advancing Multimodal Idiomaticity Representation (AdMIRe), which focuses on idiom image ranking via semantic similarity. We explore multiple strategies, including neural networks on extracted embeddings and Siamese networks with triplet loss. A key component of our methodology is the application of advanced prompt engineeringtechniques within multimodal in-context learning (ManyICL), leveraging GPT-4o, CLIP.Our experiments demonstrate that structured and optimized prompts significantly enhancethe model’s ability to interpret idiomatic expressions in a multimodal setting.