A Comparative Study of Faithfulness Metrics for Model Interpretability Methods

Chun Sik Chan, Huanqi Kong, Liang Guanqing


Abstract
Interpretability methods that reveal the internal reasoning processes behind machine learning models have attracted increasing attention in recent years. To quantify the extent to which the identified interpretations truly reflect the intrinsic decision-making mechanisms, various faithfulness evaluation metrics have been proposed. However, we find that different faithfulness metrics show conflicting preferences when comparing different interpretations. Motivated by this observation, we aim to conduct a comprehensive and comparative study of the widely adopted faithfulness metrics. In particular, we introduce two assessment dimensions, namely diagnosticity and complexity. Diagnosticity refers to the degree to which the faithfulness metric favors relatively faithful interpretations over randomly generated ones, and complexity is measured by the average number of model forward passes. According to the experimental results, we find that sufficiency and comprehensiveness metrics have higher diagnosticity and lower complexity than the other faithfulness metrics.
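The sufficiency and comprehensiveness metrics the abstract highlights can be illustrated with a minimal sketch. This follows their ERASER-style definitions (comprehensiveness: probability drop after removing the rationale tokens; sufficiency: probability drop when keeping only the rationale tokens); the toy `predict_proba` classifier and token sets below are illustrative assumptions, not the paper's models or data.

```python
def predict_proba(tokens, label="positive"):
    # Toy stand-in "model": probability grows with the number of
    # positive words present (a real setting would call a trained classifier).
    positive_words = {"great", "good", "excellent"}
    score = sum(t in positive_words for t in tokens)
    return min(1.0, 0.5 + 0.15 * score)

def comprehensiveness(tokens, rationale, label="positive"):
    # Remove the rationale tokens from the input; a faithful rationale
    # should cause a large drop in the predicted probability.
    remainder = [t for t in tokens if t not in rationale]
    return predict_proba(tokens, label) - predict_proba(remainder, label)

def sufficiency(tokens, rationale, label="positive"):
    # Keep only the rationale tokens; a faithful rationale should
    # preserve the prediction, giving a small drop.
    kept = [t for t in tokens if t in rationale]
    return predict_proba(tokens, label) - predict_proba(kept, label)

text = "the film was great and the acting excellent".split()
rationale = {"great", "excellent"}
print(comprehensiveness(text, rationale))  # large drop: rationale matters
print(sufficiency(text, rationale))        # near zero: rationale suffices
```

Note that each score costs only two forward passes per example, which is consistent with the paper's finding that these metrics have low complexity relative to perturbation-heavy alternatives.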
Anthology ID:
2022.acl-long.345
Volume:
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
May
Year:
2022
Address:
Dublin, Ireland
Editors:
Smaranda Muresan, Preslav Nakov, Aline Villavicencio
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
5029–5038
URL:
https://aclanthology.org/2022.acl-long.345
DOI:
10.18653/v1/2022.acl-long.345
Cite (ACL):
Chun Sik Chan, Huanqi Kong, and Liang Guanqing. 2022. A Comparative Study of Faithfulness Metrics for Model Interpretability Methods. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 5029–5038, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
A Comparative Study of Faithfulness Metrics for Model Interpretability Methods (Chan et al., ACL 2022)
PDF:
https://aclanthology.org/2022.acl-long.345.pdf
Data
IMDb Movie Reviews, SST