Time Course MechInterp: Analyzing the Evolution of Components and Knowledge in Large Language Models

Ahmad Dawar Hakimi; Ali Modarressi; Philipp Wicke; Hinrich Schütze

doi:10.18653/v1/2025.findings-acl.654

Abstract

Understanding how large language models (LLMs) acquire and store factual knowledge is crucial for enhancing their interpretability, reliability, and efficiency. In this work, we analyze the evolution of factual knowledge representation in the OLMo-7B model by tracking the roles of its Attention Heads and Feed Forward Networks (FFNs) over training. We classify these components into four roles—general, entity, relation-answer, and fact-answer specific—and examine their stability and transitions. Our results show that LLMs initially depend on broad, general-purpose components, which later specialize as training progresses. Once the model reliably predicts answers, some components are repurposed, suggesting an adaptive learning process. Notably, answer-specific attention heads display the highest turnover, whereas FFNs remain stable, continually refining stored knowledge. These insights offer a mechanistic view of knowledge formation in LLMs and have implications for model pruning, optimization, and transparency.

Anthology ID: 2025.findings-acl.654
Volume: Findings of the Association for Computational Linguistics: ACL 2025
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 12633–12653
URL: https://aclanthology.org/2025.findings-acl.654/
DOI: 10.18653/v1/2025.findings-acl.654
Cite (ACL): Ahmad Dawar Hakimi, Ali Modarressi, Philipp Wicke, and Hinrich Schuetze. 2025. Time Course MechInterp: Analyzing the Evolution of Components and Knowledge in Large Language Models. In Findings of the Association for Computational Linguistics: ACL 2025, pages 12633–12653, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): Time Course MechInterp: Analyzing the Evolution of Components and Knowledge in Large Language Models (Hakimi et al., Findings 2025)
PDF: https://aclanthology.org/2025.findings-acl.654.pdf
