Unveiling Language Competence Neurons: A Psycholinguistic Approach to Model Interpretability

Xufeng Duan; Xinyu Zhou; Bei Xiao; Zhenguang Cai

Abstract

As large language models (LLMs) advance in their linguistic capacity, understanding how they capture aspects of language competence remains a significant challenge. This study therefore employs psycholinguistic paradigms, which are well-suited for probing deeper cognitive aspects of language processing, to explore neuron-level representations in a language model across three tasks: sound-shape association, sound-gender association, and implicit causality. Our findings indicate that while GPT-2-XL struggles with the sound-shape task, it demonstrates human-like abilities in both sound-gender association and implicit causality. Targeted neuron ablation and activation manipulation reveal a crucial relationship: when GPT-2-XL displays a linguistic ability, specific neurons correspond to that competence; conversely, the absence of such an ability indicates a lack of specialized neurons. This study is the first to utilize psycholinguistic experiments to investigate deep language competence at the neuron level, providing a new level of granularity in model interpretability and insights into the internal mechanisms driving language ability in transformer-based LLMs.

Anthology ID: 2025.coling-main.677
Volume: Proceedings of the 31st International Conference on Computational Linguistics
Month: January
Year: 2025
Address: Abu Dhabi, UAE
Editors: Owen Rambow, Leo Wanner, Marianna Apidianaki, Hend Al-Khalifa, Barbara Di Eugenio, Steven Schockaert
Venue: COLING
Publisher: Association for Computational Linguistics
Pages: 10148–10157
URL: https://aclanthology.org/2025.coling-main.677/
Cite (ACL): Xufeng Duan, Xinyu Zhou, Bei Xiao, and Zhenguang Cai. 2025. Unveiling Language Competence Neurons: A Psycholinguistic Approach to Model Interpretability. In Proceedings of the 31st International Conference on Computational Linguistics, pages 10148–10157, Abu Dhabi, UAE. Association for Computational Linguistics.
Cite (Informal): Unveiling Language Competence Neurons: A Psycholinguistic Approach to Model Interpretability (Duan et al., COLING 2025)
PDF: https://aclanthology.org/2025.coling-main.677.pdf
