ZIP: Quantifying Which Words Matter in Zero-Shot Instructional Prompts

Nikta Gohari Sadr; Sangmitra Madhusudan; Arash Asgari; Hassan Sajjad; Laleh Seyyed-Kalantari; Ali Emami

Abstract

While zero-shot instructional prompts like “Let’s think step-by-step” have revolutionized Large Language Model performance, we lack a systematic understanding of why: which specific words drive their effectiveness, and how do these patterns vary across tasks and models? We introduce the ZIP score (Zero-shot Importance of Perturbation), a metric that quantifies individual word importance through controlled, semantically meaningful perturbations. To enable rigorous evaluation, we also introduce the first ground-truth benchmark for prompt interpretability, a set of validation prompts with predetermined keywords where ZIP achieves 95.8% accuracy compared to 65.8% for LIME. Analyzing six flagship models across seven prompts and multiple task domains, we find that word importance is task-dependent (“step-by-step” dominates mathematical reasoning; “think” matters more for common-sense tasks), varies systematically across model families, and correlates inversely with model performance, suggesting prompts have the greatest impact on tasks where models struggle. Our findings advance prompt science, providing both practical guidance for prompt engineering and theoretical understanding of how instructional language shapes model behavior.

Anthology ID: 2026.starsem-conference.30
Volume: Proceedings of the 15th Joint Conference on Lexical and Computational Semantics (*SEM 2026)
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Saif M. Mohammad, Nedjma Ousidhoum
Venues: *SEM | WS
Publisher: Association for Computational Linguistics
Pages: 428–453
URL: https://aclanthology.org/2026.starsem-conference.30/
Cite (ACL): Nikta Gohari Sadr, Sangmitra Madhusudan, Arash Asgari, Hassan Sajjad, Laleh Seyyed-Kalantari, and Ali Emami. 2026. ZIP: Quantifying Which Words Matter in Zero-Shot Instructional Prompts. In Proceedings of the 15th Joint Conference on Lexical and Computational Semantics (*SEM 2026), pages 428–453, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): ZIP: Quantifying Which Words Matter in Zero-Shot Instructional Prompts (Sadr et al., *SEM 2026)
PDF: https://aclanthology.org/2026.starsem-conference.30.pdf
