A Fine-grained Interpretability Evaluation Benchmark for Neural NLP

Lijie Wang, Yaozong Shen, Shuyuan Peng, Shuai Zhang, Xinyan Xiao, Hao Liu, Hongxuan Tang, Ying Chen, Hua Wu, Haifeng Wang


Abstract
While there is increasing concern about the interpretability of neural models, the evaluation of interpretability remains an open problem due to the lack of proper evaluation datasets and metrics. In this paper, we present a novel benchmark to evaluate the interpretability of both neural models and saliency methods. The benchmark covers three representative NLP tasks: sentiment analysis, textual similarity, and reading comprehension, each provided with both English and Chinese annotated data. To precisely evaluate interpretability, we provide token-level rationales that are carefully annotated to be sufficient, compact, and comprehensive. We also design a new metric, i.e., the consistency between the rationales before and after perturbations, to uniformly evaluate interpretability across different types of tasks. Based on this benchmark, we conduct experiments on three typical models with three saliency methods, and unveil their strengths and weaknesses in terms of interpretability. We will release this benchmark (https://www.luge.ai/#/luge/task/taskDetail?taskId=15) and hope it can facilitate research on building trustworthy systems.
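The abstract evaluates interpretability via the consistency between rationales extracted before and after a perturbation, but does not spell out how that consistency is computed. Below is a minimal illustrative sketch, assuming consistency is scored as token-level F1 overlap between the two rationales; the paper's actual formulation may differ, and the function name rationale_f1 and the example token indices are hypothetical.

    # Sketch only (assumed formulation): score rationale consistency as
    # token-level F1 overlap between the rationale for the original input
    # and the rationale for a perturbed input.
    def rationale_f1(orig_rationale, pert_rationale):
        """Token-level F1 between two rationales given as sets of token indices."""
        orig, pert = set(orig_rationale), set(pert_rationale)
        if not orig and not pert:
            return 1.0  # both empty: trivially consistent
        overlap = len(orig & pert)
        if overlap == 0:
            return 0.0
        precision = overlap / len(pert)
        recall = overlap / len(orig)
        return 2 * precision * recall / (precision + recall)

    # Hypothetical usage: indices of tokens a saliency method marks as the
    # rationale before and after a meaning-preserving perturbation.
    before = {1, 2, 5}
    after = {1, 2, 6}
    print(round(rationale_f1(before, after), 2))  # 0.67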
Anthology ID: 2022.conll-1.6
Volume: Proceedings of the 26th Conference on Computational Natural Language Learning (CoNLL)
Month: December
Year: 2022
Address: Abu Dhabi, United Arab Emirates (Hybrid)
Editors: Antske Fokkens, Vivek Srikumar
Venue: CoNLL
SIG: SIGNLL
Publisher: Association for Computational Linguistics
Pages: 70–84
URL: https://aclanthology.org/2022.conll-1.6
DOI: 10.18653/v1/2022.conll-1.6
Cite (ACL): Lijie Wang, Yaozong Shen, Shuyuan Peng, Shuai Zhang, Xinyan Xiao, Hao Liu, Hongxuan Tang, Ying Chen, Hua Wu, and Haifeng Wang. 2022. A Fine-grained Interpretability Evaluation Benchmark for Neural NLP. In Proceedings of the 26th Conference on Computational Natural Language Learning (CoNLL), pages 70–84, Abu Dhabi, United Arab Emirates (Hybrid). Association for Computational Linguistics.
Cite (Informal): A Fine-grained Interpretability Evaluation Benchmark for Neural NLP (Wang et al., CoNLL 2022)
PDF: https://aclanthology.org/2022.conll-1.6.pdf