Prompt-Guided Internal States for Hallucination Detection of Large Language Models

Fujie Zhang; Peiqi Yu; Biao Yi; Baolei Zhang; Tong Li; Zheli Liu

doi:10.18653/v1/2025.acl-long.1058

Abstract

Large Language Models (LLMs) have demonstrated remarkable capabilities across a variety of tasks in different domains. However, they sometimes generate responses that are logically coherent but factually incorrect or misleading, a phenomenon known as LLM hallucination. Data-driven supervised methods train hallucination detectors on the internal states of LLMs, but detectors trained on one domain often generalize poorly to other domains. In this paper, we aim to improve the cross-domain performance of supervised detectors using only in-domain data. We propose PRISM, a novel framework of prompt-guided internal states for hallucination detection of LLMs. By using appropriate prompts to guide changes in the structure related to text truthfulness in LLMs’ internal states, we make this structure more salient and more consistent across texts from different domains. We integrate our framework with existing hallucination detection methods and conduct experiments on datasets from different domains. The experimental results indicate that our framework significantly enhances the cross-domain generalization of existing hallucination detection methods.
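The idea in the abstract, wrapping each text in a prompt before reading the model's internal states and then training a supervised detector on those states, can be illustrated with a minimal sketch. Everything named here is an assumption for illustration only: the prompt wording in `PROMPT_TEMPLATE` is hypothetical and not taken from the paper, the "internal states" are synthetic vectors rather than real LLM activations, and the logistic-regression probe is a generic stand-in for whichever existing detector the framework is combined with.

```python
import math
import random

# Hypothetical prompt wording (an assumption, not the paper's actual prompt).
PROMPT_TEMPLATE = "Is the following statement factually accurate? Statement: {statement}"


def build_prompt(statement: str) -> str:
    """Wrap a statement in the truthfulness-oriented prompt."""
    return PROMPT_TEMPLATE.format(statement=statement.strip())


def make_synthetic_states(seed: int = 0, n_per_class: int = 20, dim: int = 4):
    """Stand-in for internal states an LLM would produce for prompted texts.

    True statements (label 1) are shifted along dimension 0 by +1,
    false ones (label 0) by -1, with Gaussian noise on every dimension.
    """
    rng = random.Random(seed)
    states, labels = [], []
    for label in (0, 1):
        shift = 1.0 if label == 1 else -1.0
        for _ in range(n_per_class):
            vec = [rng.gauss(0.0, 0.3) for _ in range(dim)]
            vec[0] += shift
            states.append(vec)
            labels.append(label)
    return states, labels


def predict(w, b, x) -> float:
    """Probability that the state x corresponds to a truthful text."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def train_probe(states, labels, lr: float = 0.1, epochs: int = 200):
    """Fit a logistic-regression probe with plain stochastic gradient descent."""
    w = [0.0] * len(states[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(states, labels):
            grad = predict(w, b, x) - y  # gradient of the log loss w.r.t. z
            w = [wi - lr * grad * xi for wi, xi in zip(w, x)]
            b -= lr * grad
    return w, b
```

In a real pipeline, `build_prompt` would be applied to each text before it is fed to the LLM, and the vectors passed to `train_probe` would be hidden states extracted from the model rather than the synthetic data used here.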

Anthology ID: 2025.acl-long.1058
Volume: Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 21806–21818
URL: https://aclanthology.org/2025.acl-long.1058/
DOI: 10.18653/v1/2025.acl-long.1058
Cite (ACL): Fujie Zhang, Peiqi Yu, Biao Yi, Baolei Zhang, Tong Li, and Zheli Liu. 2025. Prompt-Guided Internal States for Hallucination Detection of Large Language Models. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 21806–21818, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): Prompt-Guided Internal States for Hallucination Detection of Large Language Models (Zhang et al., ACL 2025)
PDF: https://aclanthology.org/2025.acl-long.1058.pdf
