Calibrating Zero-shot Cross-lingual (Un-)structured Predictions

Zhengping Jiang, Anqi Liu, Benjamin Van Durme


Abstract
We investigate model calibration in the setting of zero-shot cross-lingual transfer with large-scale pre-trained language models. Calibration is an important metric for evaluating the trustworthiness of predictive models, and well-calibrated confidence estimates are essential when language models are deployed in critical applications. We study different post-training calibration methods on both structured and unstructured prediction tasks. We find that models trained on source-language data become less calibrated when applied to a target language, and that calibration error increases with intrinsic task difficulty and the relative sparsity of training data. Moreover, we observe a potential connection between the level of calibration error and a previously proposed measure of the distance from English to other languages. Finally, our comparison demonstrates that, among the methods studied, Temperature Scaling (TS) generalizes well to distant languages, but fails to calibrate the more complex confidence estimates required in structured prediction, unlike more expressive alternatives such as Gaussian Process Calibration.
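As background for the methods named in the abstract, the following is a minimal, self-contained sketch (in Python) of post-hoc Temperature Scaling and the expected calibration error (ECE) metric. This is not the authors' code: the function names, data shapes, and optimizer bounds are illustrative assumptions, and the abstract does not specify the paper's exact implementation.

import numpy as np
from scipy.optimize import minimize_scalar

def nll(temperature, logits, labels):
    """Negative log-likelihood of labels under temperature-scaled softmax."""
    scaled = logits / temperature
    scaled = scaled - scaled.max(axis=1, keepdims=True)  # numerical stability
    log_probs = scaled - np.log(np.exp(scaled).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def fit_temperature(logits, labels):
    """Fit a single scalar T > 0 on held-out (source-language) data."""
    result = minimize_scalar(nll, bounds=(0.05, 10.0), args=(logits, labels),
                             method="bounded")
    return result.x

def expected_calibration_error(probs, labels, n_bins=10):
    """Standard ECE: bin predictions by confidence, compare accuracy to confidence."""
    confidences = probs.max(axis=1)
    predictions = probs.argmax(axis=1)
    bin_edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            acc = (predictions[mask] == labels[mask]).mean()
            conf = confidences[mask].mean()
            ece += mask.mean() * abs(acc - conf)
    return ece

In the zero-shot cross-lingual setting described above, the temperature would be fit on held-out source-language (e.g., English) logits and then applied unchanged to target-language predictions before measuring ECE; per the abstract, such scalar scaling transfers well for class-probability calibration but is less effective for the richer confidence estimates needed in structured prediction.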
Anthology ID:
2022.emnlp-main.170
Volume:
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
Month:
December
Year:
2022
Address:
Abu Dhabi, United Arab Emirates
Editors:
Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
2648–2674
URL:
https://aclanthology.org/2022.emnlp-main.170
DOI:
10.18653/v1/2022.emnlp-main.170
Cite (ACL):
Zhengping Jiang, Anqi Liu, and Benjamin Van Durme. 2022. Calibrating Zero-shot Cross-lingual (Un-)structured Predictions. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 2648–2674, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
Cite (Informal):
Calibrating Zero-shot Cross-lingual (Un-)structured Predictions (Jiang et al., EMNLP 2022)
PDF:
https://aclanthology.org/2022.emnlp-main.170.pdf