Calibration Meets Explanation: A Simple and Effective Approach for Model Confidence Estimates

Dongfang Li, Baotian Hu, Qingcai Chen


Abstract
Calibration strengthens the trustworthiness of black-box models by producing more accurate confidence estimates on given examples. However, little is known about whether model explanations can help confidence calibration. Intuitively, humans look at important feature attributions and decide whether the model is trustworthy. Similarly, explanations may tell us when the model knows and when it does not. Inspired by this, we propose a method named CME that leverages model explanations to make the model less confident with non-inductive attributions. The idea is that when the model is not highly confident, it is difficult to identify strong indications of any class, so the tokens accordingly should not have high attribution scores for any class, and vice versa. We conduct extensive experiments on six datasets with two popular pre-trained language models in the in-domain and out-of-domain settings. The results show that CME improves calibration performance in all settings. The expected calibration errors are further reduced when combined with temperature scaling. Our findings highlight that model explanations can help calibrate posterior estimates.
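
For context, the abstract refers to expected calibration error (ECE) and temperature scaling. The sketch below illustrates only those two standard components, not the CME method itself; the function names and the NumPy-based implementation are assumptions made for illustration.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Standard ECE: bin predictions by confidence and average the
    gap between bin accuracy and bin confidence, weighted by bin size."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            gap = abs(correct[mask].mean() - confidences[mask].mean())
            ece += mask.mean() * gap
    return ece

def temperature_scale(logits, temperature):
    """Post-hoc calibration: divide logits by a scalar T (fit on held-out
    data) to soften over-confident predictions before the softmax."""
    scaled = np.asarray(logits, dtype=float) / temperature
    scaled -= scaled.max(axis=-1, keepdims=True)  # numerical stability
    probs = np.exp(scaled)
    return probs / probs.sum(axis=-1, keepdims=True)
```

In this framing, calibration improves when the confidence reported for an example tracks the empirical accuracy on similar examples, which is what ECE measures and what temperature scaling adjusts post hoc.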
Anthology ID: 2022.emnlp-main.178
Volume: Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
Month: December
Year: 2022
Address: Abu Dhabi, United Arab Emirates
Editors: Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 2775–2784
URL: https://aclanthology.org/2022.emnlp-main.178
DOI: 10.18653/v1/2022.emnlp-main.178
Cite (ACL): Dongfang Li, Baotian Hu, and Qingcai Chen. 2022. Calibration Meets Explanation: A Simple and Effective Approach for Model Confidence Estimates. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 2775–2784, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
Cite (Informal): Calibration Meets Explanation: A Simple and Effective Approach for Model Confidence Estimates (Li et al., EMNLP 2022)
PDF: https://aclanthology.org/2022.emnlp-main.178.pdf