iai_MSU at SemEval-2025 Task-3: Mu-SHROOM, the Multilingual Shared-task on Hallucinations and Related Observable Overgeneration Mistakes in English

Mikhail Pukemo; Aleksandr Levykin; Dmitrii Melikhov; Gleb Skiba; Roman Ischenko; Konstantin Vorontsov

Abstract

This paper presents the submissions of the iai_MSU team for SemEval-2025 Task 3 – Mu-SHROOM, where we achieved first place for the English language. The task involves detecting hallucinations in model-generated text, which requires systems to verify claims against reliable sources. Our approach to hallucination detection is a three-stage system. The first stage uses a retrieval-based approach (Lewis et al., 2021) to verify claims against external knowledge sources. The second stage applies Self-Refine Prompting (Madaan et al., 2023) to improve detection accuracy by analyzing potential errors of the first stage. The third stage combines the predictions of the first and second stages into an ensemble. Our system achieves state-of-the-art performance on the competition dataset, demonstrating the effectiveness of combining retrieval-augmented verification with Self-Refine Prompting. The code for the solutions is available at https://github.com/pansershrek/IAI_MSU.

Anthology ID: 2025.semeval-1.28
Volume: Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Sara Rosenthal, Aiala Rosá, Debanjan Ghosh, Marcos Zampieri
Venues: SemEval | WS
Publisher: Association for Computational Linguistics
Pages: 193–197
URL: https://aclanthology.org/2025.semeval-1.28/
Cite (ACL): Mikhail Pukemo, Aleksandr Levykin, Dmitrii Melikhov, Gleb Skiba, Roman Ischenko, and Konstantin Vorontsov. 2025. iai_MSU at SemEval-2025 Task-3: Mu-SHROOM, the Multilingual Shared-task on Hallucinations and Related Observable Overgeneration Mistakes in English. In Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025), pages 193–197, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): iai_MSU at SemEval-2025 Task-3: Mu-SHROOM, the Multilingual Shared-task on Hallucinations and Related Observable Overgeneration Mistakes in English (Pukemo et al., SemEval 2025)
PDF: https://aclanthology.org/2025.semeval-1.28.pdf