2024
UHH at AVeriTeC: RAG for Fact-Checking with Real-World Claims
Özge Sevgili | Irina Nikishina | Seid Muhie Yimam | Martin Semmann | Chris Biemann
Proceedings of the Seventh Fact Extraction and VERification Workshop (FEVER)
This paper presents UHH’s approach to the AVeriTeC shared task, whose goal is to verify real-world claims using evidence from the Web. We investigate a Retrieval-Augmented Generation (RAG) model, which consists mainly of retrieval, generation, and augmentation components. We first select the top 10k pieces of evidence by BM25 score and then follow one of two approaches to retrieve the most similar evidence: (1) retrieve the top 10 pieces of evidence through vector similarity, generate questions for them, and rerank them; or (2) generate questions for the claim and retrieve the most similar evidence, again through vector similarity. Once the top evidence has been retrieved, a Large Language Model (LLM) is prompted with the claim together with either all of the evidence or one piece of evidence at a time to predict the label. Our submitted system, UHH, which uses the first approach with individual-evidence prompts, ranks 6th out of 23 systems.
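The two-stage retrieval described above (a coarse BM25 cut followed by vector-similarity ranking) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`tokenize`, `bm25_scores`, `embed`, `retrieve`), the whitespace tokenizer, the BM25 parameters `k1` and `b`, and the bag-of-words `embed` stand-in for a real embedding model are all hypothetical. Question generation, reranking, and LLM prompting are omitted.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive lowercase whitespace tokenizer (an assumption for illustration).
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score every document against the query with Okapi BM25."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    if n_docs == 0:
        return []
    avg_len = sum(len(t) for t in tokenized) / n_docs
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    query_terms = set(tokenize(query))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def embed(text):
    # Placeholder "embedding": sparse bag-of-words counts.
    # A real system would use a dense sentence-embedding model here.
    return Counter(tokenize(text))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(claim, docs, coarse_k=10_000, final_k=10):
    """Stage 1: keep the top coarse_k docs by BM25.
    Stage 2: rank those by vector similarity and keep the top final_k."""
    scores = bm25_scores(claim, docs)
    coarse = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    coarse = coarse[:coarse_k]
    claim_vec = embed(claim)
    ranked = sorted(
        coarse, key=lambda i: cosine(claim_vec, embed(docs[i])), reverse=True
    )
    return [docs[i] for i in ranked[:final_k]]
```

The defaults `coarse_k=10_000` and `final_k=10` mirror the cutoffs stated in the abstract; everything else is a simplification chosen to keep the sketch dependency-free.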