A Head to Predict and a Head to Question: Pre-trained Uncertainty Quantification Heads for Hallucination Detection in LLM Outputs

Artem Shelmanov; Ekaterina Fadeeva; Akim Tsvigun; Ivan Tsvigun; Zhuohan Xie; Igor Kiselev; Nico Daheim; Caiqi Zhang; Artem Vazhentsev; Mrinmaya Sachan; Preslav Nakov; Timothy Baldwin

doi:10.18653/v1/2025.emnlp-main.1809

A Head to Predict and a Head to Question: Pre-trained Uncertainty Quantification Heads for Hallucination Detection in LLM Outputs

Artem Shelmanov, Ekaterina Fadeeva, Akim Tsvigun, Ivan Tsvigun, Zhuohan Xie, Igor Kiselev, Nico Daheim, Caiqi Zhang, Artem Vazhentsev, Mrinmaya Sachan, Preslav Nakov, Timothy Baldwin

Abstract

LLMs have the tendency to hallucinate, i.e., to sporadically generate false or fabricated information, and users generally lack the tools to detect when this happens. Uncertainty quantification (UQ) provides a framework for assessing the reliability of model outputs, aiding in the identification of potential hallucinations. In this work, we introduce pre-trained UQ heads: supervised auxiliary modules for LLMs that substantially enhance their ability to capture uncertainty compared to unsupervised UQ methods. Their strong performance stems from the transformer architecture in their design, in the form of informative features derived from LLM attention maps and logits. Our experiments show that these heads are highly robust and achieve state-of-the-art performance in claim-level hallucination detection across both in-domain and out-of-domain prompts. Moreover, these modules demonstrate strong generalization to languages they were not explicitly trained on. We pre-train a collection of UQ heads for popular LLM series, including Mistral, Llama, and Gemma. We publicly release both the code and the pre-trained heads.

Anthology ID:: 2025.emnlp-main.1809
Volume:: Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Month:: November
Year:: 2025
Address:: Suzhou, China
Editors:: Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:: EMNLP
SIG:
Publisher:: Association for Computational Linguistics
Note:
Pages:: 35712–35731
Language:
URL:: https://aclanthology.org/2025.emnlp-main.1809/
DOI:: 10.18653/v1/2025.emnlp-main.1809
Bibkey:
Cite (ACL):: Artem Shelmanov, Ekaterina Fadeeva, Akim Tsvigun, Ivan Tsvigun, Zhuohan Xie, Igor Kiselev, Nico Daheim, Caiqi Zhang, Artem Vazhentsev, Mrinmaya Sachan, Preslav Nakov, and Timothy Baldwin. 2025. A Head to Predict and a Head to Question: Pre-trained Uncertainty Quantification Heads for Hallucination Detection in LLM Outputs. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 35712–35731, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):: A Head to Predict and a Head to Question: Pre-trained Uncertainty Quantification Heads for Hallucination Detection in LLM Outputs (Shelmanov et al., EMNLP 2025)
Copy Citation:
PDF:: https://aclanthology.org/2025.emnlp-main.1809.pdf
Checklist:: 2025.emnlp-main.1809.checklist.pdf

PDF Cite Search Checklist Fix data