Radu-Gabriel Chivereanu


2025

This paper details our approach to SemEval 2025 Shared Task 7: Multilingual and Crosslingual Fact-Checked Claim Retrieval. We investigate how large language models (LLMs) designed for general-purpose retrieval via text embeddings can be adapted for fact-checked claim retrieval across multiple languages, including scenarios where the query and the fact-check are in different languages. The experiments involve fine-tuning with a contrastive objective, which yields notable gains in both accuracy and efficiency over the baseline retrieval model. We evaluate cost-effective techniques such as LoRA, QLoRA, and Prompt Tuning. Additionally, we demonstrate the benefits of Matryoshka embeddings in minimizing the memory footprint of stored embeddings, reducing the system requirements for a fact-checking system.
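The memory argument for Matryoshka embeddings can be illustrated with a minimal sketch. It assumes an embedding model trained so that the leading components of each vector already form a usable lower-dimensional embedding; under that assumption, a stored vector can be shortened by keeping its first components and renormalizing. The function names, the dimensions (1024 and 256), and the 4-byte float size are illustrative assumptions, not values taken from the paper.

```python
import math


def truncate_embedding(vec, dim):
    """Keep the first `dim` components of `vec` and L2-renormalize them.

    Only meaningful for models trained with a Matryoshka-style objective
    (an assumption of this sketch).
    """
    if not 0 < dim <= len(vec):
        raise ValueError("dim must be between 1 and len(vec)")
    head = list(vec[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # zero vector: nothing to normalize
    return [x / norm for x in head]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def storage_bytes(num_vectors, dim, bytes_per_value=4):
    """Raw size of a dense embedding table (float32 assumed by default)."""
    return num_vectors * dim * bytes_per_value


# Hypothetical sizes: 1000 stored fact-check embeddings.
full_size = storage_bytes(1000, 1024)
short_size = storage_bytes(1000, 256)
```

With these hypothetical sizes, keeping 256 of 1024 dimensions cuts raw storage by a factor of four; how much retrieval quality survives the truncation is an empirical question for the model at hand.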