Erica Salinas
2024
HalluMeasure: Fine-grained Hallucination Measurement Using Chain-of-Thought Reasoning
Shayan Ali Akbar | Md Mosharaf Hossain | Tess Wood | Si-Chi Chin | Erica Salinas | Victor Alvarez | Erwin Cornejo
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Automating the measurement of hallucinations in LLM-generated responses is a challenging task, as it requires careful investigation of each factual claim in a response. In this paper, we introduce HalluMeasure, a new LLM-based hallucination detection mechanism that decomposes an LLM response into atomic claims and evaluates each atomic claim against the provided reference context. The model uses a step-by-step Chain-of-Thought (CoT) reasoning process and can identify 3 major categories of hallucinations (e.g., contradiction) as well as 10 more specific subtypes (e.g., overgeneralization), which help identify the reasons behind hallucination errors. Specifically, we explore four different configurations for HalluMeasure’s classifier: with and without CoT prompting, and using a single classifier call to classify all claims versus separate calls for each claim. The best-performing configuration (with CoT and separate calls for each claim) demonstrates significant improvements in detecting hallucinations, achieving a 10-point increase in F1 score on our TechNewsSumm dataset and a 3-point increase in AUC ROC on the SummEval dataset, compared to three baseline models (RefChecker, AlignScore, and Vectara HHEM). We further show reasonable accuracy on detecting 10 novel error subtypes of hallucinations (where even humans struggle in classification) derived from linguistic analysis of the errors made by the LLMs.
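The abstract's pipeline (decompose a response into atomic claims, then classify each claim against the reference context with a separate CoT call) can be illustrated with a minimal Python sketch. This is not the paper's implementation: the prompts, the label set, and the generic `llm` callable are assumptions for illustration only; the paper's actual taxonomy has 3 categories and 10 subtypes.

```python
from typing import Callable, Dict, List

# Hypothetical label set; the paper's taxonomy (3 categories, 10 subtypes) is richer.
CLAIM_LABELS = ["supported", "contradiction", "unverifiable"]

# Hypothetical decomposition prompt: split the response into atomic factual claims.
DECOMPOSE_PROMPT = (
    "Break the following response into a list of atomic factual claims, "
    "one per line:\n\n{response}"
)

# Hypothetical Chain-of-Thought classification prompt, issued once per claim
# (the best-performing configuration reported in the abstract).
CLASSIFY_PROMPT = (
    "Reference context:\n{context}\n\n"
    "Claim: {claim}\n\n"
    "Think step by step about whether the claim is supported by the context, "
    "then answer with exactly one label from {labels}."
)


def measure_hallucinations(
    response: str,
    context: str,
    llm: Callable[[str], str],
) -> List[Dict[str, str]]:
    """Decompose a response into atomic claims and classify each claim
    against the reference context with a separate CoT call per claim."""
    # Step 1: decompose the response into atomic claims.
    raw_claims = llm(DECOMPOSE_PROMPT.format(response=response))
    claims = [
        line.strip("- ").strip() for line in raw_claims.splitlines() if line.strip()
    ]

    # Step 2: classify each claim independently against the reference context.
    results = []
    for claim in claims:
        answer = llm(
            CLASSIFY_PROMPT.format(context=context, claim=claim, labels=CLAIM_LABELS)
        )
        # Take the first recognized label mentioned in the CoT answer;
        # fall back to "unverifiable" if none is found.
        label = next(
            (lab for lab in CLAIM_LABELS if lab in answer.lower()),
            "unverifiable",
        )
        results.append({"claim": claim, "label": label})
    return results
```

Any LLM client can be plugged in as `llm` (a function from prompt string to response string), e.g. `measure_hallucinations(summary, source_article, my_llm)`, which returns one labeled record per atomic claim.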