SPUQ: Perturbation-Based Uncertainty Quantification for Large Language Models

Xiang Gao, Jiaxin Zhang, Lalla Mouatadid, Kamalika Das

Abstract
In recent years, large language models (LLMs) have become increasingly prevalent, offering remarkable text generation capabilities. However, a pressing challenge is their tendency to make confidently wrong predictions, highlighting the critical need for uncertainty quantification (UQ) in LLMs. While previous works have mainly focused on addressing aleatoric uncertainty, the full spectrum of uncertainties, including epistemic uncertainty, remains inadequately explored. Motivated by this gap, we introduce a novel UQ method, sampling with perturbation for UQ (SPUQ), designed to tackle both aleatoric and epistemic uncertainty. The method entails generating a set of perturbations of the LLM input, sampling outputs for each perturbation, and incorporating an aggregation module that generalizes the sampling-based uncertainty approach to text generation tasks. Through extensive experiments on various datasets, we investigate different perturbation and aggregation techniques. Our findings show a substantial improvement in model uncertainty calibration, with a reduction in Expected Calibration Error (ECE) of 50% on average. These results suggest that SPUQ offers a promising step toward enhancing the reliability and trustworthiness of LLMs.
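The abstract outlines a three-step pipeline: perturb the input, sample an output for each perturbed input, and aggregate inter-sample agreement into a confidence score. Below is a minimal Python sketch of that pipeline, not the authors' implementation: the generate callable stands in for any LLM API, the dummy-token perturbation is only one hypothetical choice of perturbation, and the SequenceMatcher-based agreement score is a simple stand-in for the aggregation techniques the paper investigates (see the PDF for the actual choices).

from difflib import SequenceMatcher

def perturb(prompt: str, k: int) -> list[str]:
    # Hypothetical perturbation: append varying runs of dummy tokens.
    # A SPUQ-style setup could instead paraphrase the prompt or vary
    # the system message or sampling temperature.
    return [prompt] + [prompt + " " + "." * i for i in range(1, k)]

def agreement(a: str, b: str) -> float:
    # Crude textual-similarity stand-in for a proper matching metric.
    return SequenceMatcher(None, a, b).ratio()

def spuq_confidence(generate, prompt: str, k: int = 5, temperature: float = 0.7):
    """Sample one output per perturbed input and score the answer to the
    unperturbed prompt by its mean agreement with the perturbed answers."""
    outputs = [generate(p, temperature) for p in perturb(prompt, k)]
    base = outputs[0]  # answer to the unperturbed prompt
    confidence = sum(agreement(base, o) for o in outputs[1:]) / (k - 1)
    return base, confidence

Usage, assuming any generate(prompt, temperature) -> str wrapper around an LLM:

answer, conf = spuq_confidence(call_my_llm, "What year did Apollo 11 land?")

A low conf signals high uncertainty; the idea is that outputs unstable under input perturbation reflect epistemic uncertainty, complementing the aleatoric uncertainty captured by repeated sampling alone.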
Anthology ID:
2024.eacl-long.143
Volume:
Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
March
Year:
2024
Address:
St. Julian’s, Malta
Editors:
Yvette Graham, Matthew Purver
Venue:
EACL
Publisher:
Association for Computational Linguistics
Pages:
2336–2346
URL:
https://aclanthology.org/2024.eacl-long.143
Cite (ACL):
Xiang Gao, Jiaxin Zhang, Lalla Mouatadid, and Kamalika Das. 2024. SPUQ: Perturbation-Based Uncertainty Quantification for Large Language Models. In Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers), pages 2336–2346, St. Julian’s, Malta. Association for Computational Linguistics.
Cite (Informal):
SPUQ: Perturbation-Based Uncertainty Quantification for Large Language Models (Gao et al., EACL 2024)
PDF:
https://aclanthology.org/2024.eacl-long.143.pdf