Maximilian Idahl
2021
Towards Benchmarking the Utility of Explanations for Model Debugging
Maximilian Idahl | Lijun Lyu | Ujwal Gadiraju | Avishek Anand
Proceedings of the First Workshop on Trustworthy Natural Language Processing
Post-hoc explanation methods are an important class of approaches that help users understand the rationale underlying a trained model’s decision. But how useful are they to an end user trying to accomplish a given task? In this vision paper, we argue for a benchmark that facilitates evaluating the utility of post-hoc explanation methods. As a first step toward this end, we enumerate desirable properties that such a benchmark should possess for the task of debugging text classifiers. Additionally, we highlight that such a benchmark makes it possible to assess not only the effectiveness of explanations but also their efficiency.