Chain-of-Verification Reduces Hallucination in Large Language Models

Shehzaad Dhuliawala; Mojtaba Komeili; Jing Xu; Roberta Raileanu; Xian Li; Asli Celikyilmaz; Jason Weston

Abstract

Generation of plausible yet incorrect factual information, termed hallucination, is an unsolved issue in large language models. We study the ability of language models to deliberate on the responses they give in order to correct their mistakes. We develop the Chain-of-Verification (CoVe) method whereby the model first (i) drafts an initial response; then (ii) plans verification questions to fact-check its draft; (iii) answers those questions independently so the answers are not biased by other responses; and (iv) generates its final verified response. In experiments, we show CoVe decreases hallucinations across a variety of tasks, from list-based questions from Wikidata to closed-book MultiSpanQA and longform text generation.
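The four steps in the abstract can be sketched in a few lines of Python. This is only an illustrative outline, not the authors' implementation: the `llm` callable (prompt in, completion out), the function name `chain_of_verification`, and all prompt wording are assumptions made for this example. The point it tries to show is step (iii): each verification question is answered from a prompt that contains only that question, so the draft cannot bias the answer.

```python
from typing import Callable, List

# Hypothetical interface: any function mapping a prompt string to a completion.
LLM = Callable[[str], str]


def chain_of_verification(query: str, llm: LLM) -> str:
    """Illustrative four-step CoVe loop; prompt texts are assumptions."""
    # (i) Draft an initial response.
    draft = llm(f"Answer the question.\nQuestion: {query}\nAnswer:")

    # (ii) Plan verification questions that fact-check the draft.
    plan = llm(
        f"Question: {query}\nDraft answer: {draft}\n"
        "List verification questions, one per line, that fact-check the draft:"
    )
    questions: List[str] = [ln.strip() for ln in plan.splitlines() if ln.strip()]

    # (iii) Answer each question independently: the prompt holds only the
    # question itself, never the draft or the other answers.
    answers = [llm(f"Question: {q}\nAnswer:") for q in questions]

    # (iv) Generate the final response, conditioned on the verification results.
    evidence = "\n".join(f"Q: {q}\nA: {a}" for q, a in zip(questions, answers))
    return llm(
        f"Original question: {query}\nDraft answer: {draft}\n"
        f"Verification results:\n{evidence}\n"
        "Write a final answer consistent with the verification results:"
    )
```

Any client can be plugged in as `llm`, for example a thin wrapper around a hosted model or a local one. Passing the model as a plain callable keeps the sketch independent of any particular API.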

Anthology ID: 2024.findings-acl.212
Volume: Findings of the Association for Computational Linguistics ACL 2024
Month: August
Year: 2024
Address: Bangkok, Thailand and virtual meeting
Editors: Lun-Wei Ku, Andre Martins, Vivek Srikumar
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 3563–3578
URL: https://aclanthology.org/2024.findings-acl.212
Cite (ACL): Shehzaad Dhuliawala, Mojtaba Komeili, Jing Xu, Roberta Raileanu, Xian Li, Asli Celikyilmaz, and Jason Weston. 2024. Chain-of-Verification Reduces Hallucination in Large Language Models. In Findings of the Association for Computational Linguistics ACL 2024, pages 3563–3578, Bangkok, Thailand and virtual meeting. Association for Computational Linguistics.
Cite (Informal): Chain-of-Verification Reduces Hallucination in Large Language Models (Dhuliawala et al., Findings 2024)
PDF: https://aclanthology.org/2024.findings-acl.212.pdf