PromptExplainer: Explaining Language Models through Prompt-based Learning

Zijian Feng, Hanzhang Zhou, Zixiao Zhu, Kezhi Mao


Abstract
Pretrained language models have become workhorses for various natural language processing (NLP) tasks, sparking a growing demand for enhanced interpretability and transparency. However, prevailing explanation methods, such as attention-based and gradient-based strategies, largely rely on linear approximations, potentially causing inaccuracies such as accentuating irrelevant input tokens. To mitigate the issue, we develop PromptExplainer, a novel method for explaining language models through prompt-based learning. PromptExplainer aligns the explanation process with the masked language modeling (MLM) task of pretrained language models and leverages the prompt-based learning framework for explanation generation. It disentangles token representations into the explainable embedding space using the MLM head and extracts discriminative features with a verbalizer to generate class-dependent explanations. Extensive experiments demonstrate that PromptExplainer significantly outperforms state-of-the-art explanation methods.
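The two steps named in the abstract (projecting token representations into the vocabulary space with the MLM head, then reading off class-dependent scores through a verbalizer) can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the tiny vocabulary, the weights, the hidden vectors, and the choice to average label-word logits are all assumptions made for illustration only.

```python
from typing import Dict, List

Vector = List[float]


def mlm_head(hidden: Vector, weight: List[Vector], bias: Vector) -> Vector:
    """Map one token's hidden vector to vocabulary logits (a plain linear layer).

    `weight` has one row per vocabulary word; a real MLM head is more elaborate.
    """
    return [sum(h * w for h, w in zip(hidden, row)) + b
            for row, b in zip(weight, bias)]


def class_scores(vocab_logits: Vector, verbalizer: Dict[str, List[int]]) -> Dict[str, float]:
    """Average the logits of each class's label words (an assumed aggregation)."""
    return {label: sum(vocab_logits[i] for i in ids) / len(ids)
            for label, ids in verbalizer.items()}


def explain(token_hiddens: List[Vector], weight: List[Vector], bias: Vector,
            verbalizer: Dict[str, List[int]], target: str) -> List[float]:
    """Return one relevance score per input token for the target class."""
    return [class_scores(mlm_head(h, weight, bias), verbalizer)[target]
            for h in token_hiddens]


# Hypothetical toy setup: 4-word vocabulary, 2-dimensional hidden states.
VOCAB = ["good", "great", "bad", "awful"]
VERBALIZER = {"positive": [0, 1], "negative": [2, 3]}  # class -> label-word ids
WEIGHT = [[1.0, 0.0], [0.5, 0.0], [0.0, 1.0], [0.0, 0.5]]
BIAS = [0.0, 0.0, 0.0, 0.0]
TOKENS = ["the", "movie", "rocks"]
HIDDENS = [[0.0, 0.0], [0.25, 0.25], [2.0, 0.0]]  # made-up token representations

scores = explain(HIDDENS, WEIGHT, BIAS, VERBALIZER, "positive")
top_token = TOKENS[scores.index(max(scores))]
print(list(zip(TOKENS, scores)), "->", top_token)
```

In this sketch each token receives a separate score per class, so the same sentence can yield different explanations for "positive" and "negative", which is the sense in which the abstract calls the explanations class-dependent.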
Anthology ID:
2024.findings-eacl.60
Volume:
Findings of the Association for Computational Linguistics: EACL 2024
Month:
March
Year:
2024
Address:
St. Julian’s, Malta
Editors:
Yvette Graham, Matthew Purver
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
882–895
URL:
https://aclanthology.org/2024.findings-eacl.60
Cite (ACL):
Zijian Feng, Hanzhang Zhou, Zixiao Zhu, and Kezhi Mao. 2024. PromptExplainer: Explaining Language Models through Prompt-based Learning. In Findings of the Association for Computational Linguistics: EACL 2024, pages 882–895, St. Julian’s, Malta. Association for Computational Linguistics.
Cite (Informal):
PromptExplainer: Explaining Language Models through Prompt-based Learning (Feng et al., Findings 2024)
PDF:
https://aclanthology.org/2024.findings-eacl.60.pdf
Software:
2024.findings-eacl.60.software.zip