@inproceedings{shelmanov-etal-2025-uncertainty,
title = "Uncertainty Quantification for Large Language Models",
author = "Shelmanov, Artem and
Panov, Maxim and
Vashurin, Roman and
Vazhentsev, Artem and
Fadeeva, Ekaterina and
Baldwin, Timothy",
editor = "Arase, Yuki and
Jurgens, David and
Xia, Fei",
booktitle = "Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 5: Tutorial Abstracts)",
month = jul,
year = "2025",
address = "Vienna, Austria",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2025.acl-tutorials.3/",
doi = "10.18653/v1/2025.acl-tutorials.3",
pages = "3--4",
ISBN = "979-8-89176-255-8",
abstract = "Large language models (LLMs) are widely used in NLP applications, but their tendency to produce hallucinations poses significant challenges to the reliability and safety, ultimately undermining user trust. This tutorial offers the first systematic introduction to uncertainty quantification (UQ) for LLMs in text generation tasks {--} a conceptual and methodological framework that provides tools for communicating the reliability of a model answer. This additional output could be leveraged for a range of downstream tasks, including hallucination detection and selective generation. We begin with the theoretical foundations of uncertainty, highlighting why techniques developed for classification might fall short in text generation. Building on this grounding, we survey state-of-the-art white-box and black-box UQ methods, from simple entropy-based scores to supervised probes over hidden states and attention weights, and show how they enable selective generation and hallucination detection. Additionally, we discuss the calibration of uncertainty scores for better interpretability. A key feature of the tutorial is practical examples using LM-Polygraph, an open-source framework that unifies more than a dozen recent UQ and calibration algorithms and provides a large-scale benchmark, allowing participants to implement UQ in their applications, as well as reproduce and extend experimental results with only a few lines of code. By the end of the session, researchers and practitioners will be equipped to (i) evaluate and compare existing UQ techniques, (ii) develop new methods, and (iii) implement UQ in their code for deploying safer, more trustworthy LLM-based systems."
}
Artem Shelmanov, Maxim Panov, Roman Vashurin, Artem Vazhentsev, Ekaterina Fadeeva, and Timothy Baldwin. 2025. Uncertainty Quantification for Large Language Models. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 5: Tutorial Abstracts), pages 3–4, Vienna, Austria. Association for Computational Linguistics.