Contextualized Sequence Likelihood: Enhanced Confidence Scores for Natural Language Generation

Zhen Lin, Shubhendu Trivedi, Jimeng Sun


Abstract
The advent of large language models (LLMs) has dramatically advanced the state-of-the-art in numerous natural language generation tasks. For LLMs to be applied reliably, it is essential to have an accurate measure of their confidence. Currently, the most commonly used confidence score function is the likelihood of the generated sequence, which, however, conflates semantic and syntactic components. For instance, in question-answering (QA) tasks, an awkward phrasing of the correct answer might result in a lower probability prediction. Additionally, different tokens should be weighted differently depending on the context. In this work, we propose enhancing the predicted sequence probability by assigning different weights to various tokens using attention values elicited from the base LLM. By employing a validation set, we can identify the relevant attention heads, thereby significantly improving the reliability of the vanilla sequence probability confidence measure. We refer to this new score as the Contextualized Sequence Likelihood (CSL). CSL is easy to implement, fast to compute, and offers considerable potential for further improvement with task-specific prompts. Across several QA datasets and a diverse array of LLMs, CSL has demonstrated significantly higher reliability than state-of-the-art baselines in predicting generation quality, as measured by the AUROC or AUARC.
Anthology ID:
2024.emnlp-main.578
Volume:
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2024
Address:
Miami, Florida, USA
Editors:
Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue:
EMNLP
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
10351–10368
Language:
URL:
https://aclanthology.org/2024.emnlp-main.578
DOI:
Bibkey:
Cite (ACL):
Zhen Lin, Shubhendu Trivedi, and Jimeng Sun. 2024. Contextualized Sequence Likelihood: Enhanced Confidence Scores for Natural Language Generation. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 10351–10368, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal):
Contextualized Sequence Likelihood: Enhanced Confidence Scores for Natural Language Generation (Lin et al., EMNLP 2024)
Copy Citation:
PDF:
https://aclanthology.org/2024.emnlp-main.578.pdf
Software:
 2024.emnlp-main.578.software.zip