Mechanisms vs. Outcomes: Probing for Syntax Fails to Explain Performance on Targeted Syntactic Evaluations

Ananth Agarwal; Jasper Jian; Christopher D. Manning; Shikhar Murty

doi:10.18653/v1/2025.emnlp-main.1712

Abstract

Large Language Models (LLMs) exhibit a robust mastery of syntax when processing and generating text. While this suggests internalized understanding of hierarchical syntax and dependency relations, the precise mechanism by which they represent syntactic structure is an open area within interpretability research. Probing provides one way to identify syntactic mechanisms linearly encoded in activations; however, no comprehensive study has yet established whether a model’s probing accuracy reliably predicts its downstream syntactic performance. Adopting a “mechanisms vs. outcomes” framework, we evaluate 32 open-weight transformer models and find that syntactic features extracted via probing fail to predict outcomes of targeted syntax evaluations across English linguistic phenomena. Our results highlight a substantial disconnect between latent syntactic representations found via probing and observable syntactic behaviors in downstream tasks.
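One way to make the question "does probing accuracy predict downstream syntactic performance?" concrete is to compute a rank correlation between per-model probing scores and per-model evaluation accuracies. The sketch below is illustrative only: the abstract does not say which statistical analysis the authors use, the function names are hypothetical, and the numbers are made up for demonstration rather than taken from the paper.

```python
from typing import Sequence


def ranks(values: Sequence[float]) -> list[float]:
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical per-model scores (NOT results from the paper).
probe_scores = [0.81, 0.84, 0.79, 0.88, 0.83]   # mechanism: probing accuracy
eval_accuracy = [0.92, 0.75, 0.88, 0.80, 0.95]  # outcome: targeted evaluation
rho = spearman(probe_scores, eval_accuracy)
print(f"Spearman rho between probing and evaluation scores: {rho:.3f}")
```

A correlation near zero across many models would be consistent with the disconnect the abstract describes, while a value near 1 would suggest that probing scores track behavior. A rank correlation is used here only because it makes no linearity assumption; it is a design choice of this sketch.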

Anthology ID: 2025.emnlp-main.1712
Volume: Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Month: November
Year: 2025
Address: Suzhou, China
Editors: Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 33737–33757
URL: https://aclanthology.org/2025.emnlp-main.1712/
DOI: 10.18653/v1/2025.emnlp-main.1712
Cite (ACL): Ananth Agarwal, Jasper Jian, Christopher D Manning, and Shikhar Murty. 2025. Mechanisms vs. Outcomes: Probing for Syntax Fails to Explain Performance on Targeted Syntactic Evaluations. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 33737–33757, Suzhou, China. Association for Computational Linguistics.
Cite (Informal): Mechanisms vs. Outcomes: Probing for Syntax Fails to Explain Performance on Targeted Syntactic Evaluations (Agarwal et al., EMNLP 2025)
PDF: https://aclanthology.org/2025.emnlp-main.1712.pdf
Checklist: 2025.emnlp-main.1712.checklist.pdf
