Joint Energy-based Model Training for Better Calibrated Natural Language Understanding Models

Tianxing He, Bryan McCann, Caiming Xiong, Ehsan Hosseini-Asl


Abstract
In this work, we explore joint energy-based model (EBM) training during the finetuning of pretrained text encoders (e.g., RoBERTa) for natural language understanding (NLU) tasks. Our experiments show that EBM training can help the model reach a better calibration that is competitive with strong baselines, with little or no loss in accuracy. We discuss three variants of energy functions (namely scalar, hidden, and sharp-hidden) that can be defined on top of a text encoder, and compare them in experiments. Due to the discreteness of text data, we adopt noise contrastive estimation (NCE) to train the energy-based model. To make NCE training more effective, we train an auto-regressive noise model with the masked language model (MLM) objective.
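The abstract describes two ingredients: energy functions defined on top of a text encoder (scalar, hidden, sharp-hidden) and an NCE objective that contrasts real sentences with samples from a noise model. Below is a minimal, hypothetical PyTorch sketch of how these pieces could fit together. It is not the authors' released implementation (see the salesforce/ebm_calibration_nlu repository linked below); the names ToyEncoder, EnergyClassifier, and nce_loss, the stand-in mean-pooling encoder, and the hidden/sharp-hidden definitions via logsumexp and max over class logits are illustrative assumptions based on standard joint-EBM formulations, not reproductions of the paper's equations.

# Hypothetical sketch of the energy-function variants and a simplified NCE loss.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyEncoder(nn.Module):
    """Stand-in for a pretrained text encoder such as RoBERTa (mean-pooled embeddings)."""
    def __init__(self, vocab_size=1000, hidden=64):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, hidden)

    def forward(self, token_ids):                 # (batch, seq) -> (batch, hidden)
        return self.emb(token_ids).mean(dim=1)

class EnergyClassifier(nn.Module):
    """Task classifier with an energy function defined on top of the encoder."""
    def __init__(self, encoder, hidden=64, num_classes=2, variant="hidden"):
        super().__init__()
        self.encoder, self.variant = encoder, variant
        self.cls_head = nn.Linear(hidden, num_classes)   # class logits g(x)[y]
        self.scalar_head = nn.Linear(hidden, 1)          # extra head for the "scalar" variant

    def forward(self, token_ids):
        h = self.encoder(token_ids)
        return self.cls_head(h), h

    def energy(self, token_ids):
        logits, h = self.forward(token_ids)
        if self.variant == "scalar":        # energy from a separate scalar head
            return self.scalar_head(h).squeeze(-1)
        if self.variant == "hidden":        # E(x) = -logsumexp_y g(x)[y]
            return -torch.logsumexp(logits, dim=-1)
        if self.variant == "sharp-hidden":  # E(x) = -max_y g(x)[y]
            return -logits.max(dim=-1).values
        raise ValueError(self.variant)

def nce_loss(model, real_ids, noise_ids, log_pn_real, log_pn_noise, k=1):
    """Binary NCE: discriminate data samples from k samples drawn from a noise model.
    log_pn_* are log-probabilities of each sequence under the noise language model."""
    log_k = torch.log(torch.tensor(float(k)))
    # Logit of P(sample is data) = -E(x) - log p_n(x) - log k (unnormalized EBM).
    logit_real = -model.energy(real_ids) - log_pn_real - log_k
    logit_noise = -model.energy(noise_ids) - log_pn_noise - log_k
    loss_real = F.binary_cross_entropy_with_logits(logit_real, torch.ones_like(logit_real))
    loss_noise = F.binary_cross_entropy_with_logits(logit_noise, torch.zeros_like(logit_noise))
    return loss_real + k * loss_noise

# Example usage with random toy data:
# model = EnergyClassifier(ToyEncoder(), variant="sharp-hidden")
# real = torch.randint(0, 1000, (4, 12)); noise = torch.randint(0, 1000, (4, 12))
# loss = nce_loss(model, real, noise, torch.randn(4), torch.randn(4), k=1)

In the paper's setting, the noise sequences and their log-probabilities would come from the auto-regressive noise model mentioned in the abstract; here they are simply passed in as tensors.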
Anthology ID:
2021.eacl-main.151
Volume:
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume
Month:
April
Year:
2021
Address:
Online
Editors:
Paola Merlo, Jörg Tiedemann, Reut Tsarfaty
Venue:
EACL
Publisher:
Association for Computational Linguistics
Pages:
1754–1761
URL:
https://aclanthology.org/2021.eacl-main.151
DOI:
10.18653/v1/2021.eacl-main.151
Cite (ACL):
Tianxing He, Bryan McCann, Caiming Xiong, and Ehsan Hosseini-Asl. 2021. Joint Energy-based Model Training for Better Calibrated Natural Language Understanding Models. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, pages 1754–1761, Online. Association for Computational Linguistics.
Cite (Informal):
Joint Energy-based Model Training for Better Calibrated Natural Language Understanding Models (He et al., EACL 2021)
PDF:
https://aclanthology.org/2021.eacl-main.151.pdf
Code
salesforce/ebm_calibration_nlu
Data
GLUE
QNLI