Deceptive Semantic Shortcuts on Reasoning Chains: How Far Can Models Go without Hallucination?

Bangzheng Li, Ben Zhou, Fei Wang, Xingyu Fu, Dan Roth, Muhao Chen


Abstract
Despite the high performance of large language models (LLMs) across numerous benchmarks, recent research has revealed that they suffer from hallucinations and unfaithful reasoning. This work studies a type of hallucination induced by semantic associations. We investigate to what extent LLMs take shortcuts from certain keyword/entity biases in the prompt instead of following correct reasoning paths. To quantify this phenomenon, we propose a novel probing method and benchmark called EUREQA. EUREQA is an entity-searching task where a model finds a missing entity based on described multi-hop relations with other entities. These deliberately designed multi-hop relations create deceptive semantic associations, and models must stick to the correct reasoning path instead of incorrect shortcuts to find the correct answer. Experiments show that existing LLMs cannot follow correct reasoning paths while resisting the temptation of greedy shortcuts, with GPT-4 achieving only 62% accuracy. Analyses provide further evidence that LLMs rely on semantic biases to solve the task rather than reasoning properly, calling into question the validity and generalizability of current LLMs’ high performance.
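As a rough illustration of the probing format the abstract describes, the sketch below composes a multi-hop entity-search question from a chain of relations anchored at a single named entity. The relation strings, the helper name build_probe, and the example question are assumptions made for illustration only, not the paper's actual data-construction code.

```python
def build_probe(relations: list[str], anchor: str) -> str:
    """Compose a chained entity-search question (illustrative sketch).

    relations[-1] links directly to the named anchor entity; every earlier
    relation links to the unnamed entity described by the next relation.
    The correct answer can only be found by following all hops in order,
    while the named anchor may carry strong surface associations that
    tempt a model toward a shortcut answer off the reasoning path.
    """
    description = anchor
    for relation in reversed(relations[1:]):
        description = f"the entity that {relation} {description}"
    return f"Which entity {relations[0]} {description}?"

# Hypothetical example: the answer requires two hops (find the composer,
# then the birthplace), but the named film may bias a shortcut-prone model
# toward a location associated with the film itself.
print(build_probe(
    ["is the birthplace of", "composed the soundtrack of"],
    "Inception",
))
# -> "Which entity is the birthplace of the entity that composed the
#     soundtrack of Inception?"
```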
Anthology ID:
2024.naacl-long.424
Volume:
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
Month:
June
Year:
2024
Address:
Mexico City, Mexico
Editors:
Kevin Duh, Helena Gomez, Steven Bethard
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
7668–7681
URL:
https://aclanthology.org/2024.naacl-long.424
Cite (ACL):
Bangzheng Li, Ben Zhou, Fei Wang, Xingyu Fu, Dan Roth, and Muhao Chen. 2024. Deceptive Semantic Shortcuts on Reasoning Chains: How Far Can Models Go without Hallucination?. In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), pages 7668–7681, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal):
Deceptive Semantic Shortcuts on Reasoning Chains: How Far Can Models Go without Hallucination? (Li et al., NAACL 2024)
PDF:
https://aclanthology.org/2024.naacl-long.424.pdf
Copyright:
 2024.naacl-long.424.copyright.pdf