Truth as a Trajectory: What Internal Representations Reveal About Large Language Model Reasoning

Hamed Damirchi; Imezadelajara; Ehsan Abbasnejad; Afshar Shamsi; Zhen Zhang; Javen Qinfeng Shi

Abstract

Existing explainability methods for Large Language Models (LLMs) typically treat hidden states as static points in activation space, assuming that correct and incorrect inferences can be separated using representations from an individual layer. However, these activations are saturated with polysemantic features, so linear probes tend to learn surface-level lexical patterns rather than underlying reasoning structures. We introduce Truth as a Trajectory (TaT), which models transformer inference as an unfolded trajectory of iterative refinements, shifting analysis from static activations to layer-wise geometric displacement. By analyzing the displacement of representations across layers, TaT captures structural patterns in the evolution of inference that distinguish valid reasoning from spurious behavior. We evaluate TaT across dense and Mixture-of-Experts (MoE) architectures on benchmarks spanning commonsense reasoning, question answering, and toxicity detection. Without access to the activations themselves and using only changes in activations across layers, we show that TaT effectively mitigates reliance on static lexical confounds and outperforms conventional probing, establishing trajectory analysis as a complementary perspective on LLM explainability.
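To make the idea of layer-wise displacement concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical input `hidden_states`, one vector per layer for a single token position, and computes the differences between consecutive layers. The summary statistics (`step_norms`, `turn_cosines`) are illustrative choices; the abstract does not specify which trajectory features or classifier TaT uses.

```python
import math
from typing import Dict, List, Sequence

Vector = Sequence[float]


def layer_displacements(hidden_states: Sequence[Vector]) -> List[List[float]]:
    """Return h[l+1] - h[l] for every pair of consecutive layers."""
    return [
        [b - a for a, b in zip(prev, nxt)]
        for prev, nxt in zip(hidden_states, hidden_states[1:])
    ]


def trajectory_features(hidden_states: Sequence[Vector]) -> Dict[str, List[float]]:
    """Summarize a trajectory using only displacements, never raw activations.

    The two statistics below are assumptions made for illustration:
    - step_norms: Euclidean length of each layer-to-layer step
    - turn_cosines: cosine similarity between consecutive steps
    """
    deltas = layer_displacements(hidden_states)
    norms = [math.sqrt(sum(x * x for x in d)) for d in deltas]
    cosines = []
    for d1, d2, n1, n2 in zip(deltas, deltas[1:], norms, norms[1:]):
        dot = sum(x * y for x, y in zip(d1, d2))
        # Guard against zero-length steps to avoid division by zero.
        cosines.append(dot / (n1 * n2) if n1 > 0 and n2 > 0 else 0.0)
    return {"step_norms": norms, "turn_cosines": cosines}


# Toy 2-dimensional "hidden states" for four layers (made-up numbers).
toy_states = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0], [6.0, 10.0]]
features = trajectory_features(toy_states)
```

In a setup like this, a downstream classifier would see only `features`, so any signal it picks up comes from how the representation moves between layers and not from where it sits in activation space.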

Anthology ID: 2026.acl-long.2073
Volume: Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 44774–44790
URL: https://aclanthology.org/2026.acl-long.2073/
Cite (ACL): Hamed Damirchi, Imezadelajara, Ehsan Abbasnejad, Afshar Shamsi, Zhen Zhang, and Javen Qinfeng Shi. 2026. Truth as a Trajectory: What Internal Representations Reveal About Large Language Model Reasoning. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 44774–44790, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): Truth as a Trajectory: What Internal Representations Reveal About Large Language Model Reasoning (Damirchi et al., ACL 2026)
PDF: https://aclanthology.org/2026.acl-long.2073.pdf
Checklist: 2026.acl-long.2073.checklist.pdf
