Interpret and Improve In-Context Learning via the Lens of Input-Label Mappings

Chenghao Sun; Zhen Huang; Yonggang Zhang; Le Lu; Houqiang Li; Xinmei Tian; Xu Shen; Jieping Ye

doi:10.18653/v1/2025.acl-long.196

Abstract

Large language models (LLMs) excel at downstream NLP tasks through in-context learning (ICL) with a few demonstrations of input-label pairs. However, the internal mechanisms behind ICL remain under-explored, particularly the mappings between inputs and labels. In this work, we reverse-engineer ICL by examining input-label mappings: what they are within LLMs, where they function, and how LLMs utilize them. (1) What: We discover that input-label mappings are stored within a few specific layers in the form of principal components (PCs), which capture human-interpretable, task-related words. (2) Where: We propose a PC patching approach to identify the modules where input-label mappings function. Specifically, PC patching automatically crafts counterfactual representations from the identified semantic PCs, rather than relying on manually designed counterfactual text, to suppress the ICL-related behavior of the patched modules. Using PC patching, we find that LLMs apply input-label mappings in a small fraction of attention heads. (3) How: We observe and verify that the identified key heads use input-label mappings from demonstrations to generate target labels for new queries. Based on these discoveries, we further show that precisely fine-tuning the key ICL-related modules leads to significant improvements across diverse tasks.
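
The idea of crafting a counterfactual representation from principal components can be illustrated with a toy sketch. It rests on assumptions not stated in the abstract: that PCs are taken from an SVD of centered hidden states, and that "patching" means subtracting a vector's projection onto the top-k PCs. The function names, the random data, and the choice of k are all hypothetical, and the paper's actual procedure may differ.

```python
import numpy as np


def top_principal_components(reps: np.ndarray, k: int) -> np.ndarray:
    """Return the top-k principal directions, shape (k, d), of row-wise vectors."""
    centered = reps - reps.mean(axis=0, keepdims=True)
    # Rows of vt are orthonormal directions sorted by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:k]


def remove_pc_projection(rep: np.ndarray, pcs: np.ndarray) -> np.ndarray:
    """Assumed form of a counterfactual: rep minus its projection onto pcs."""
    return rep - pcs.T @ (pcs @ rep)


# Hypothetical stand-in for hidden states: 32 vectors of dimension 16.
rng = np.random.default_rng(0)
reps = rng.normal(size=(32, 16))

pcs = top_principal_components(reps, k=2)
patched = remove_pc_projection(reps[0], pcs)
```

In a real experiment the patched vector would replace a module's activation during a forward pass, and the change in the model's output would indicate how much that module depends on the removed directions.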

Anthology ID: 2025.acl-long.196
Volume: Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 3873–3895
URL: https://aclanthology.org/2025.acl-long.196/
DOI: 10.18653/v1/2025.acl-long.196
Cite (ACL): Chenghao Sun, Zhen Huang, Yonggang Zhang, Le Lu, Houqiang Li, Xinmei Tian, Xu Shen, and Jieping Ye. 2025. Interpret and Improve In-Context Learning via the Lens of Input-Label Mappings. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 3873–3895, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): Interpret and Improve In-Context Learning via the Lens of Input-Label Mappings (Sun et al., ACL 2025)
PDF: https://aclanthology.org/2025.acl-long.196.pdf
