FENJI at SemEval-2025 Task 3: Retrieval-Augmented Generation and Hallucination Span Detection

Flor Alberts; Ivo Bruinier; Nathalie Palm; Justin Paetzelt; Erik Varecha

Abstract

Large Language Models (LLMs) have significantly advanced Natural Language Processing. However, ensuring the factual reliability of these models remains a challenge, as they are prone to hallucination: generating text that appears coherent but contains inaccurate or unsupported information. SemEval-2025 Mu-SHROOM focused on character-level hallucination detection in 14 languages. In this task, participants were required to pinpoint hallucinated spans in text generated by multiple instruction-tuned LLMs. Our team created a system that combined a Retrieval-Augmented Generation (RAG) approach with a prompted FLAN-T5 model to identify hallucination spans. Contrary to prior literature, our approach yielded disappointing results, underperforming all the “mark-all” baselines and failing to achieve competitive scores. Notably, removing RAG improved performance. The findings highlight that while RAG holds potential for hallucination detection, its effectiveness is heavily influenced by the retrieval component’s context-awareness. Enhancing the retrieval component’s ability to capture more comprehensive contextual information could improve performance across languages, making RAG a more reliable tool for identifying hallucination spans.

Anthology ID: 2025.semeval-1.151
Volume: Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Sara Rosenthal, Aiala Rosá, Debanjan Ghosh, Marcos Zampieri
Venues: SemEval | WS
Publisher: Association for Computational Linguistics
Pages: 1143–1151
URL: https://aclanthology.org/2025.semeval-1.151/
Cite (ACL): Flor Alberts, Ivo Bruinier, Nathalie Palm, Justin Paetzelt, and Erik Varecha. 2025. FENJI at SemEval-2025 Task 3: Retrieval-Augmented Generation and Hallucination Span Detection. In Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025), pages 1143–1151, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): FENJI at SemEval-2025 Task 3: Retrieval-Augmented Generation and Hallucination Span Detection (Alberts et al., SemEval 2025)
PDF: https://aclanthology.org/2025.semeval-1.151.pdf
