@inproceedings{arras-etal-2019-evaluating,
title = "Evaluating Recurrent Neural Network Explanations",
author = {Arras, Leila and
Osman, Ahmed and
M{\"u}ller, Klaus-Robert and
Samek, Wojciech},
editor = "Linzen, Tal and
Chrupa{\l}a, Grzegorz and
Belinkov, Yonatan and
Hupkes, Dieuwke",
booktitle = "Proceedings of the 2019 ACL Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP",
month = aug,
year = "2019",
address = "Florence, Italy",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/W19-4813",
doi = "10.18653/v1/W19-4813",
pages = "113--126",
abstract = "Recently, several methods have been proposed to explain the predictions of recurrent neural networks (RNNs), in particular of LSTMs. The goal of these methods is to understand the network{'}s decisions by assigning to each input variable, e.g., a word, a relevance indicating to which extent it contributed to a particular prediction. In previous works, some of these methods were not yet compared to one another, or were evaluated only qualitatively. We close this gap by systematically and quantitatively comparing these methods in different settings, namely (1) a toy arithmetic task which we use as a sanity check, (2) a five-class sentiment prediction of movie reviews, and besides (3) we explore the usefulness of word relevances to build sentence-level representations. Lastly, using the method that performed best in our experiments, we show how specific linguistic phenomena such as the negation in sentiment analysis reflect in terms of relevance patterns, and how the relevance visualization can help to understand the misclassification of individual samples.",
}
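
For scripting against this record, here is a minimal sketch, assuming the third-party bibtexparser package (v1 API) and a hypothetical refs.bib file containing the entry above:

```python
# Minimal sketch: load the BibTeX record above with the third-party
# bibtexparser package (v1 API); "refs.bib" is a hypothetical file name.
import bibtexparser

with open("refs.bib") as f:
    db = bibtexparser.load(f)  # parse into a BibDatabase

entry = db.entries[0]  # each entry is exposed as a plain dict of fields
print(entry["ID"])     # arras-etal-2019-evaluating
print(entry["title"])  # Evaluating Recurrent Neural Network Explanations
print(entry["pages"])  # 113--126
```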
<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
  <mods ID="arras-etal-2019-evaluating">
    <titleInfo>
      <title>Evaluating Recurrent Neural Network Explanations</title>
    </titleInfo>
    <name type="personal">
      <namePart type="given">Leila</namePart>
      <namePart type="family">Arras</namePart>
      <role>
        <roleTerm authority="marcrelator" type="text">author</roleTerm>
      </role>
    </name>
    <name type="personal">
      <namePart type="given">Ahmed</namePart>
      <namePart type="family">Osman</namePart>
      <role>
        <roleTerm authority="marcrelator" type="text">author</roleTerm>
      </role>
    </name>
    <name type="personal">
      <namePart type="given">Klaus-Robert</namePart>
      <namePart type="family">Müller</namePart>
      <role>
        <roleTerm authority="marcrelator" type="text">author</roleTerm>
      </role>
    </name>
    <name type="personal">
      <namePart type="given">Wojciech</namePart>
      <namePart type="family">Samek</namePart>
      <role>
        <roleTerm authority="marcrelator" type="text">author</roleTerm>
      </role>
    </name>
    <originInfo>
      <dateIssued>2019-08</dateIssued>
    </originInfo>
    <typeOfResource>text</typeOfResource>
    <relatedItem type="host">
      <titleInfo>
        <title>Proceedings of the 2019 ACL Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP</title>
      </titleInfo>
      <name type="personal">
        <namePart type="given">Tal</namePart>
        <namePart type="family">Linzen</namePart>
        <role>
          <roleTerm authority="marcrelator" type="text">editor</roleTerm>
        </role>
      </name>
      <name type="personal">
        <namePart type="given">Grzegorz</namePart>
        <namePart type="family">Chrupała</namePart>
        <role>
          <roleTerm authority="marcrelator" type="text">editor</roleTerm>
        </role>
      </name>
      <name type="personal">
        <namePart type="given">Yonatan</namePart>
        <namePart type="family">Belinkov</namePart>
        <role>
          <roleTerm authority="marcrelator" type="text">editor</roleTerm>
        </role>
      </name>
      <name type="personal">
        <namePart type="given">Dieuwke</namePart>
        <namePart type="family">Hupkes</namePart>
        <role>
          <roleTerm authority="marcrelator" type="text">editor</roleTerm>
        </role>
      </name>
      <originInfo>
        <publisher>Association for Computational Linguistics</publisher>
        <place>
          <placeTerm type="text">Florence, Italy</placeTerm>
        </place>
      </originInfo>
      <genre authority="marcgt">conference publication</genre>
    </relatedItem>
    <abstract>Recently, several methods have been proposed to explain the predictions of recurrent neural networks (RNNs), in particular of LSTMs. The goal of these methods is to understand the network’s decisions by assigning to each input variable, e.g., a word, a relevance score indicating to what extent it contributed to a particular prediction. In previous work, some of these methods had not yet been compared to one another, or had been evaluated only qualitatively. We close this gap by systematically and quantitatively comparing these methods in different settings, namely (1) a toy arithmetic task which we use as a sanity check, (2) a five-class sentiment prediction task on movie reviews, and (3) an exploration of the usefulness of word relevances for building sentence-level representations. Lastly, using the method that performed best in our experiments, we show how specific linguistic phenomena, such as negation in sentiment analysis, are reflected in relevance patterns, and how relevance visualization can help to understand the misclassification of individual samples.</abstract>
<identifier type="citekey">arras-etal-2019-evaluating</identifier>
<identifier type="doi">10.18653/v1/W19-4813</identifier>
<location>
<url>https://aclanthology.org/W19-4813</url>
</location>
<part>
<date>2019-08</date>
<extent unit="page">
<start>113</start>
<end>126</end>
</extent>
</part>
</mods>
</modsCollection>
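
The MODS record can be read with Python's standard library alone; a minimal sketch, assuming the XML above is saved to a hypothetical record.xml:

```python
# Minimal sketch: extract title, DOI, and authors from the MODS record
# using only the standard library; "record.xml" is a hypothetical file name.
import xml.etree.ElementTree as ET

NS = {"m": "http://www.loc.gov/mods/v3"}  # MODS v3 namespace declared above

root = ET.parse("record.xml").getroot()   # <modsCollection>
mods = root.find("m:mods", NS)

title = mods.findtext("m:titleInfo/m:title", namespaces=NS)
doi = mods.findtext("m:identifier[@type='doi']", namespaces=NS)
authors = [
    " ".join(part.text for part in name.findall("m:namePart", NS))
    for name in mods.findall("m:name", NS)  # direct children only: the authors
    if name.findtext("m:role/m:roleTerm", namespaces=NS) == "author"
]
print(title, doi, authors, sep="\n")
```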
%0 Conference Proceedings
%T Evaluating Recurrent Neural Network Explanations
%A Arras, Leila
%A Osman, Ahmed
%A Müller, Klaus-Robert
%A Samek, Wojciech
%Y Linzen, Tal
%Y Chrupała, Grzegorz
%Y Belinkov, Yonatan
%Y Hupkes, Dieuwke
%S Proceedings of the 2019 ACL Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP
%D 2019
%8 August
%I Association for Computational Linguistics
%C Florence, Italy
%F arras-etal-2019-evaluating
%X Recently, several methods have been proposed to explain the predictions of recurrent neural networks (RNNs), in particular of LSTMs. The goal of these methods is to understand the network’s decisions by assigning to each input variable, e.g., a word, a relevance score indicating to what extent it contributed to a particular prediction. In previous work, some of these methods had not yet been compared to one another, or had been evaluated only qualitatively. We close this gap by systematically and quantitatively comparing these methods in different settings, namely (1) a toy arithmetic task which we use as a sanity check, (2) a five-class sentiment prediction task on movie reviews, and (3) an exploration of the usefulness of word relevances for building sentence-level representations. Lastly, using the method that performed best in our experiments, we show how specific linguistic phenomena, such as negation in sentiment analysis, are reflected in relevance patterns, and how relevance visualization can help to understand the misclassification of individual samples.
%R 10.18653/v1/W19-4813
%U https://aclanthology.org/W19-4813
%U https://doi.org/10.18653/v1/W19-4813
%P 113-126
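
The %-tagged (refer/EndNote) export above is line-oriented and straightforward to parse by hand; a minimal sketch, where the record.enw file name and the list-per-tag representation are assumptions of this sketch:

```python
# Minimal sketch: parse the %-tagged (refer/EndNote) record above.
# "record.enw" is a hypothetical file name; collecting repeated tags
# (e.g. %A, %Y, %U) into lists is a design choice of this sketch.
from collections import defaultdict

fields = defaultdict(list)
with open("record.enw") as f:
    for line in f:
        line = line.rstrip("\n")
        if line.startswith("%") and len(line) > 2:
            tag, value = line[:2], line[3:]  # "%A Arras, Leila" -> "%A", "Arras, Leila"
            fields[tag].append(value)

print(fields["%T"][0])  # title
print(fields["%A"])     # authors, one per %A line
print(fields["%P"][0])  # page range
```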
Markdown (Informal):
[Evaluating Recurrent Neural Network Explanations](https://aclanthology.org/W19-4813) (Arras et al., BlackboxNLP 2019)

ACL:
Leila Arras, Ahmed Osman, Klaus-Robert Müller, and Wojciech Samek. 2019. Evaluating Recurrent Neural Network Explanations. In *Proceedings of the 2019 ACL Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP*, pages 113–126, Florence, Italy. Association for Computational Linguistics.