A Hierarchical Explanation Generation Method Based on Feature Interaction Detection

Yiming Ju, Yuanzhe Zhang, Kang Liu, Jun Zhao


Abstract
The opaqueness of deep NLP models has motivated efforts to explain how these models arrive at their predictions. Recent work has introduced hierarchical attribution explanations, which compute attribution scores for groups of text hierarchically in order to capture compositional semantics. Existing work on hierarchical attribution tends to restrict each text group to a continuous text span, a restriction we call the connecting rule. Although continuous spans are easy for humans to read, limiting the attribution unit to them may miss long-distance feature interactions that are important for reflecting model predictions. In this work, we introduce a novel strategy for capturing feature interactions and use it to build hierarchical explanations without the connecting rule. The proposed method can convert widely used non-hierarchical explanation methods (e.g., LIME) into corresponding hierarchical versions. Experimental results show the effectiveness of our approach in building high-quality hierarchical explanations.
Anthology ID: 2023.findings-acl.798
Volume: Findings of the Association for Computational Linguistics: ACL 2023
Month: July
Year: 2023
Address: Toronto, Canada
Editors: Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 12600–12611
URL: https://aclanthology.org/2023.findings-acl.798
DOI: 10.18653/v1/2023.findings-acl.798
Cite (ACL): Yiming Ju, Yuanzhe Zhang, Kang Liu, and Jun Zhao. 2023. A Hierarchical Explanation Generation Method Based on Feature Interaction Detection. In Findings of the Association for Computational Linguistics: ACL 2023, pages 12600–12611, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal): A Hierarchical Explanation Generation Method Based on Feature Interaction Detection (Ju et al., Findings 2023)
PDF: https://aclanthology.org/2023.findings-acl.798.pdf