Pareto Optimal Learning for Estimating Large Language Model Errors

Theodore Zhao, Mu Wei, J. Preston, Hoifung Poon


Abstract
Large Language Models (LLMs) have shown impressive abilities in many applications. When a concrete and precise answer is desired, it is important to have a quantitative estimate of the potential error rate. However, this can be challenging due to the text-in-text-out nature of generative models. We present a method based on Pareto optimization that generates a risk score to estimate the probability of error in an LLM response by integrating multiple sources of information. We prove theoretically that the error estimator optimized in our framework aligns with the LLM and the information sources in a Pareto optimal manner. Experimental results show that the risk scores estimated by our method are well correlated with the true LLM error rate, thus facilitating error correction. By dynamically combining the risk score with prompting strategies such as self-verification and information retrieval, we demonstrate that the proposed method can be used to increase the performance of an LLM, surpassing state-of-the-art task-specific models.
Anthology ID:
2024.acl-long.566
Volume:
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
August
Year:
2024
Address:
Bangkok, Thailand
Editors:
Lun-Wei Ku, Andre Martins, Vivek Srikumar
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
10513–10529
URL:
https://aclanthology.org/2024.acl-long.566
Cite (ACL):
Theodore Zhao, Mu Wei, J. Preston, and Hoifung Poon. 2024. Pareto Optimal Learning for Estimating Large Language Model Errors. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 10513–10529, Bangkok, Thailand. Association for Computational Linguistics.
Cite (Informal):
Pareto Optimal Learning for Estimating Large Language Model Errors (Zhao et al., ACL 2024)
PDF:
https://aclanthology.org/2024.acl-long.566.pdf