Joint Energy-based Model Training for Better Calibrated Natural Language Understanding Models

Tianxing He, Bryan McCann, Caiming Xiong, Ehsan Hosseini-Asl


Abstract
In this work, we explore joint energy-based model (EBM) training during the finetuning of pretrained text encoders (e.g., RoBERTa) for natural language understanding (NLU) tasks. Our experiments show that EBM training can help the model reach a better calibration that is competitive with strong baselines, with little or no loss in accuracy. We discuss three variants of energy functions (namely scalar, hidden, and sharp-hidden) that can be defined on top of a text encoder, and compare them in experiments. Due to the discreteness of text data, we adopt noise contrastive estimation (NCE) to train the energy-based model. To make NCE training more effective, we train an auto-regressive noise model with the masked language model (MLM) objective.
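The abstract describes two ingredients: energy functions defined on top of a text encoder (scalar, hidden, sharp-hidden) and an NCE objective that contrasts real sentences with samples from a noise model. Below is a minimal, hypothetical PyTorch sketch of how these pieces could fit together. It is not the authors' released implementation (see the salesforce/ebm_calibration_nlu repository linked below); the names ToyEncoder, EnergyClassifier, and nce_loss, the stand-in mean-pooling encoder, and the hidden/sharp-hidden definitions via logsumexp and max over class logits are illustrative assumptions based on standard joint-EBM formulations, not reproductions of the paper's equations.

# Hypothetical sketch of the energy-function variants and a simplified NCE loss.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyEncoder(nn.Module):
    """Stand-in for a pretrained text encoder such as RoBERTa (mean-pooled embeddings)."""
    def __init__(self, vocab_size=1000, hidden=64):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, hidden)

    def forward(self, token_ids):                 # (batch, seq) -> (batch, hidden)
        return self.emb(token_ids).mean(dim=1)

class EnergyClassifier(nn.Module):
    """Task classifier with an energy function defined on top of the encoder."""
    def __init__(self, encoder, hidden=64, num_classes=2, variant="hidden"):
        super().__init__()
        self.encoder, self.variant = encoder, variant
        self.cls_head = nn.Linear(hidden, num_classes)   # class logits g(x)[y]
        self.scalar_head = nn.Linear(hidden, 1)          # extra head for the "scalar" variant

    def forward(self, token_ids):
        h = self.encoder(token_ids)
        return self.cls_head(h), h

    def energy(self, token_ids):
        logits, h = self.forward(token_ids)
        if self.variant == "scalar":        # energy from a separate scalar head
            return self.scalar_head(h).squeeze(-1)
        if self.variant == "hidden":        # E(x) = -logsumexp_y g(x)[y]
            return -torch.logsumexp(logits, dim=-1)
        if self.variant == "sharp-hidden":  # E(x) = -max_y g(x)[y]
            return -logits.max(dim=-1).values
        raise ValueError(self.variant)

def nce_loss(model, real_ids, noise_ids, log_pn_real, log_pn_noise, k=1):
    """Binary NCE: discriminate data samples from k samples drawn from a noise model.
    log_pn_* are log-probabilities of each sequence under the noise language model."""
    log_k = torch.log(torch.tensor(float(k)))
    # Logit of P(sample is data) = -E(x) - log p_n(x) - log k (unnormalized EBM).
    logit_real = -model.energy(real_ids) - log_pn_real - log_k
    logit_noise = -model.energy(noise_ids) - log_pn_noise - log_k
    loss_real = F.binary_cross_entropy_with_logits(logit_real, torch.ones_like(logit_real))
    loss_noise = F.binary_cross_entropy_with_logits(logit_noise, torch.zeros_like(logit_noise))
    return loss_real + k * loss_noise

# Example usage with random toy data:
# model = EnergyClassifier(ToyEncoder(), variant="sharp-hidden")
# real = torch.randint(0, 1000, (4, 12)); noise = torch.randint(0, 1000, (4, 12))
# loss = nce_loss(model, real, noise, torch.randn(4), torch.randn(4), k=1)

In the paper's setting, the noise sequences and their log-probabilities would come from the auto-regressive noise model mentioned in the abstract; here they are simply passed in as tensors.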
Anthology ID:
2021.eacl-main.151
Volume:
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume
Month:
April
Year:
2021
Address:
Online
Editors:
Paola Merlo, Jörg Tiedemann, Reut Tsarfaty
Venue:
EACL
Publisher:
Association for Computational Linguistics
Pages:
1754–1761
URL:
https://aclanthology.org/2021.eacl-main.151
DOI:
10.18653/v1/2021.eacl-main.151
Cite (ACL):
Tianxing He, Bryan McCann, Caiming Xiong, and Ehsan Hosseini-Asl. 2021. Joint Energy-based Model Training for Better Calibrated Natural Language Understanding Models. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, pages 1754–1761, Online. Association for Computational Linguistics.
Cite (Informal):
Joint Energy-based Model Training for Better Calibrated Natural Language Understanding Models (He et al., EACL 2021)
PDF:
https://aclanthology.org/2021.eacl-main.151.pdf
Code
salesforce/ebm_calibration_nlu
Data
GLUE
QNLI