Distance-aware Calibration for Pre-trained Language Models

Alberto Gasparin, Gianluca Detommaso


Abstract
Language Models for text classification often produce overconfident predictions for both in-distribution and out-of-distribution samples, i.e., the model’s output probabilities do not match its accuracy. Prior work showed that simple post-hoc approaches are effective for mitigating this issue, but they are not robust in noisy settings, e.g., when the distribution shift is caused by spelling mistakes. In this work, we propose Distance Aware Calibration (DAC), a post-hoc approach that adjusts the confidence scores of a Language Model by leveraging the distance between the samples being evaluated and the in-domain training set. We show that applying DAC on top of a Language Model can improve in-domain calibration, robustness to different kinds of distribution shift, and the model’s ability to detect out-of-distribution samples. We provide an extensive evaluation on common text classification benchmarks for both calibration and out-of-distribution detection tasks.
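To make the core idea concrete, below is a minimal sketch of distance-aware confidence adjustment: a test sample's embedding distance to the in-domain training set is mapped to a softmax temperature, so that samples far from the training data receive softer, less confident probabilities. The function names, the kNN distance, and the linear distance-to-temperature mapping are illustrative assumptions for exposition, not the paper's exact method.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax; higher temperature -> flatter probabilities.
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def knn_distance(x, train_embeddings, k=10):
    # Average Euclidean distance from x to its k nearest training embeddings.
    d = np.linalg.norm(train_embeddings - x, axis=1)
    return np.sort(d)[:k].mean()

def distance_aware_confidence(logits, x_embedding, train_embeddings,
                              base_temp=1.0, alpha=0.1, k=10):
    # Hypothetical post-hoc rule: the farther the sample lies from the
    # in-domain training set, the higher the temperature, hence the less
    # confident the predicted class probabilities.
    dist = knn_distance(x_embedding, train_embeddings, k=k)
    temperature = base_temp + alpha * dist
    return softmax(logits, temperature=temperature)
```

In such a scheme, `base_temp` and `alpha` would typically be tuned on a held-out in-domain validation set so that in-distribution calibration is preserved while out-of-distribution samples are down-weighted.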
Anthology ID:
2024.findings-emnlp.725
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2024
Month:
November
Year:
2024
Address:
Miami, Florida, USA
Editors:
Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
12434–12447
URL:
https://aclanthology.org/2024.findings-emnlp.725
Cite (ACL):
Alberto Gasparin and Gianluca Detommaso. 2024. Distance-aware Calibration for Pre-trained Language Models. In Findings of the Association for Computational Linguistics: EMNLP 2024, pages 12434–12447, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal):
Distance-aware Calibration for Pre-trained Language Models (Gasparin & Detommaso, Findings 2024)
PDF:
https://aclanthology.org/2024.findings-emnlp.725.pdf