Faithful and Plausible Natural Language Explanations for Image Classification: A Pipeline Approach

Adam Wojciechowski, Mateusz Lango, Ondrej Dusek


Abstract
Existing explanation methods for image classification struggle to provide faithful and plausible explanations. This paper addresses this issue by proposing a post-hoc natural language explanation method that can be applied to any CNN-based classifier without altering its training process or affecting predictive performance. By analysing influential neurons and the corresponding activation maps, the method generates a faithful description of the classifier’s decision process in the form of a structured meaning representation, which is then converted into text by a language model. Through this pipeline approach, the generated explanations are grounded in the neural network architecture, providing accurate insight into the classification process while remaining accessible to non-experts. Experimental results show that the natural language explanations (NLEs) constructed by our method are significantly more plausible and faithful than baselines. In particular, user interventions in the neural network structure (masking of neurons) are three times more effective.
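The pipeline outlined in the abstract (find influential neurons, build a structured meaning representation, then test an intervention by masking neurons) can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a classifier whose final layer is linear over a small feature vector, and the function names, the weight matrix, and the neuron labels are all hypothetical. The final step, where a language model turns the meaning representation into text, is left out.

```python
import numpy as np

def influential_neurons(features, weights, class_idx, k=2):
    """Indices of the k neurons contributing most to the class logit.

    Assumption: the contribution of a neuron is activation * class weight.
    """
    contributions = features * weights[class_idx]
    return np.argsort(contributions)[::-1][:k]

def mask_neurons(features, neuron_ids):
    """Return a copy of the features with the chosen neurons zeroed out."""
    masked = features.copy()
    masked[list(neuron_ids)] = 0.0
    return masked

def to_meaning_representation(class_name, neuron_ids, neuron_labels):
    """Structured record that a language model could later verbalise."""
    evidence = [neuron_labels.get(int(i), f"neuron_{int(i)}") for i in neuron_ids]
    return {"class": class_name, "evidence": evidence}

# Hypothetical penultimate-layer activations and final-layer weights (2 classes).
features = np.array([0.2, 1.5, 0.0, 0.9])
weights = np.array([[0.1, 2.0, 0.3, 0.5],
                    [1.0, 0.1, 0.2, 0.4]])
labels = {1: "striped fur", 3: "pointed ears"}  # hypothetical neuron concepts

top = influential_neurons(features, weights, class_idx=0, k=2)
mr = to_meaning_representation("cat", top, labels)

pred_before = int(np.argmax(weights @ features))
pred_after = int(np.argmax(weights @ mask_neurons(features, top)))
```

If masking the neurons named in an explanation changes the prediction, as it does in this toy setup, that suggests the explanation points at neurons the classifier actually relies on.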
Anthology ID:
2024.findings-emnlp.130
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2024
Month:
November
Year:
2024
Address:
Miami, Florida, USA
Editors:
Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
2340–2351
URL:
https://aclanthology.org/2024.findings-emnlp.130
Cite (ACL):
Adam Wojciechowski, Mateusz Lango, and Ondrej Dusek. 2024. Faithful and Plausible Natural Language Explanations for Image Classification: A Pipeline Approach. In Findings of the Association for Computational Linguistics: EMNLP 2024, pages 2340–2351, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal):
Faithful and Plausible Natural Language Explanations for Image Classification: A Pipeline Approach (Wojciechowski et al., Findings 2024)
PDF:
https://aclanthology.org/2024.findings-emnlp.130.pdf