Ujwal Gadiraju


Towards Benchmarking the Utility of Explanations for Model Debugging
Maximilian Idahl | Lijun Lyu | Ujwal Gadiraju | Avishek Anand
Proceedings of the First Workshop on Trustworthy Natural Language Processing

Post-hoc explanation methods are an important class of approaches that help understand the rationale underlying a trained model’s decision. But how useful are they for an end-user towards accomplishing a given task? In this vision paper, we argue the need for a benchmark to facilitate evaluations of the utility of post-hoc explanation methods. As a first step to this end, we enumerate desirable properties that such a benchmark should possess for the task of debugging text classifiers. Additionally, we highlight that such a benchmark facilitates not only assessing the effectiveness of explanations but also their efficiency.