A Glitch in the Matrix? Locating and Detecting Language Model Grounding with Fakepedia

Giovanni Monea; Maxime Peyrard; Martin Josifoski; Vishrav Chaudhary; Jason Eisner; Emre Kiciman; Hamid Palangi; Barun Patra; Robert West

doi:10.18653/v1/2024.acl-long.369

Abstract

Large language models (LLMs) have an impressive ability to draw on novel information supplied in their context. Yet the mechanisms underlying this contextual grounding remain unknown, especially in situations where contextual information contradicts factual knowledge stored in the parameters, which LLMs also excel at recalling. Favoring the contextual information is critical for retrieval-augmented generation methods, which enrich the context with up-to-date information, hoping that grounding can rectify outdated or noisy stored knowledge. We present a novel method to study grounding abilities using Fakepedia, a dataset of counterfactual texts constructed to clash with a model’s internal parametric knowledge. We benchmark various LLMs with Fakepedia and conduct a causal mediation analysis of LLM components when answering Fakepedia queries, based on our Masked Grouped Causal Tracing (MGCT) method. Through this analysis, we identify distinct computational patterns between grounded and ungrounded responses. We finally demonstrate that distinguishing grounded from ungrounded responses is achievable through computational analysis alone. Our results, together with existing findings about factual recall mechanisms, provide a coherent narrative of how grounding and factual recall mechanisms interact within LLMs.

Anthology ID: 2024.acl-long.369
Volume: Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: August
Year: 2024
Address: Bangkok, Thailand
Editors: Lun-Wei Ku, Andre Martins, Vivek Srikumar
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 6828–6844
URL: https://aclanthology.org/2024.acl-long.369/
DOI: 10.18653/v1/2024.acl-long.369
Cite (ACL): Giovanni Monea, Maxime Peyrard, Martin Josifoski, Vishrav Chaudhary, Jason Eisner, Emre Kiciman, Hamid Palangi, Barun Patra, and Robert West. 2024. A Glitch in the Matrix? Locating and Detecting Language Model Grounding with Fakepedia. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 6828–6844, Bangkok, Thailand. Association for Computational Linguistics.
Cite (Informal): A Glitch in the Matrix? Locating and Detecting Language Model Grounding with Fakepedia (Monea et al., ACL 2024)
PDF: https://aclanthology.org/2024.acl-long.369.pdf
