Attention Flows are Shapley Value Explanations

Kawin Ethayarajh; Dan Jurafsky

doi:10.18653/v1/2021.acl-short.8


Abstract

Shapley Values, a solution to the credit assignment problem in cooperative game theory, are a popular type of explanation in machine learning, having been used to explain the importance of features, embeddings, and even neurons. In NLP, however, leave-one-out and attention-based explanations still predominate. Can we draw a connection between these different methods? We formally prove that, save for the degenerate case, attention weights and leave-one-out values cannot be Shapley Values. Attention flow is a post-processed variant of attention weights obtained by running the max-flow algorithm on the attention graph. Perhaps surprisingly, we prove that attention flows are indeed Shapley Values, at least at the layerwise level. Given the many desirable theoretical qualities of Shapley Values, which have driven their adoption among the ML community, we argue that NLP practitioners should, when possible, adopt attention flow explanations alongside more traditional ones.
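As a rough illustration of the terms used in the abstract, the sketch below computes exact Shapley Values for a small cooperative game by averaging each player's marginal contribution over all orderings, and contrasts them with leave-one-out values. The game itself (`CAPACITY`, `toy_value`) and the helper names are hypothetical, chosen only for illustration; they are not taken from the paper, and the comparison is a toy example rather than the paper's proof.

```python
from fractions import Fraction
from itertools import permutations


def shapley_values(players, value):
    """Exact Shapley Values: average marginal contribution over all orderings."""
    totals = {p: Fraction(0) for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            # Marginal contribution of p when joining the current coalition.
            totals[p] += Fraction(value(with_p)) - Fraction(value(coalition))
            coalition = with_p
    return {p: total / len(orderings) for p, total in totals.items()}


def leave_one_out(players, value):
    """Leave-one-out value: payoff drop when a single player is removed."""
    everyone = frozenset(players)
    return {p: value(everyone) - value(everyone - {p}) for p in players}


# Hypothetical toy game (an assumption, not from the paper): the payoff is
# the summed capacity of the coalition, capped at 4.
CAPACITY = {"a": 3, "b": 2, "c": 1}


def toy_value(coalition):
    return min(4, sum(CAPACITY[p] for p in coalition))


players = list(CAPACITY)
phi = shapley_values(players, toy_value)
loo = leave_one_out(players, toy_value)

print("Shapley:", {p: str(v) for p, v in phi.items()})
print("Leave-one-out:", loo)
print("Shapley sum:", sum(phi.values()), "| LOO sum:", sum(loo.values()))
```

In this toy game the Shapley Values sum to the payoff of the full coalition (the efficiency property), while the leave-one-out values do not, which gives some intuition for why the two kinds of explanation can disagree.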

Anthology ID: 2021.acl-short.8
Volume: Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)
Month: August
Year: 2021
Address: Online
Editors: Chengqing Zong, Fei Xia, Wenjie Li, Roberto Navigli
Venues: ACL | IJCNLP
Publisher: Association for Computational Linguistics
Pages: 49–54
URL: https://aclanthology.org/2021.acl-short.8
DOI: 10.18653/v1/2021.acl-short.8
Cite (ACL): Kawin Ethayarajh and Dan Jurafsky. 2021. Attention Flows are Shapley Value Explanations. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers), pages 49–54, Online. Association for Computational Linguistics.
Cite (Informal): Attention Flows are Shapley Value Explanations (Ethayarajh & Jurafsky, ACL-IJCNLP 2021)
PDF: https://aclanthology.org/2021.acl-short.8.pdf
Video: https://aclanthology.org/2021.acl-short.8.mp4
