Pre-Training Transformers as Energy-Based Cloze Models

Kevin Clark; Minh-Thang Luong; Quoc Le; Christopher D. Manning

doi:10.18653/v1/2020.emnlp-main.20

Abstract

We introduce Electric, an energy-based cloze model for representation learning over text. Like BERT, it is a conditional generative model of tokens given their contexts. However, Electric does not use masking or output a full distribution over tokens that could occur in a context. Instead, it assigns a scalar energy score to each input token indicating how likely it is given its context. We train Electric using an algorithm based on noise-contrastive estimation and elucidate how this learning objective is closely related to the recently proposed ELECTRA pre-training method. Electric performs well when transferred to downstream tasks and is particularly effective at producing likelihood scores for text: it re-ranks speech recognition n-best lists better than language models and much faster than masked language models. Furthermore, it offers a clearer and more principled view of what ELECTRA learns during pre-training.
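As a rough illustration of the training signal described in the abstract, the sketch below shows a generic binary noise-contrastive estimation loss for a single token position, followed by a helper that re-ranks candidate transcriptions by their total energy. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `nce_token_loss` and `rerank`, the noise-to-data ratio `k`, and the callable `energy_fn` are hypothetical, and the abstract does not spell out the exact form of the loss or the scoring rule.

```python
import math


def _softplus(x):
    # Numerically stable log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def nce_token_loss(energy, noise_prob, is_data, k=1.0):
    """Generic binary NCE loss for one token position (illustrative assumption).

    energy:     scalar energy of the token in its context; lower means more
                plausible, so the unnormalized model score is exp(-energy).
    noise_prob: probability of the token under a noise distribution q.
    is_data:    True if the token came from the data, False if it was
                sampled from the noise distribution.
    k:          assumed ratio of noise samples to data samples.
    """
    # Log-odds that the token is real rather than noise:
    # log exp(-energy) - log(k * q).
    logit = -energy - math.log(k * noise_prob)
    # Negative log-likelihood of the correct binary label.
    return _softplus(-logit) if is_data else _softplus(logit)


def rerank(hypotheses, energy_fn):
    """Sort hypotheses so that the lowest total energy comes first.

    energy_fn is a hypothetical callable that returns one energy per token
    of a hypothesis; summing them is assumed here as the sequence score.
    """
    return sorted(hypotheses, key=lambda h: sum(energy_fn(h)))
```

With such a scoring function, every token of a hypothesis can be scored without masking positions one at a time, which is consistent with the speed advantage over masked language models that the abstract reports.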

Anthology ID: 2020.emnlp-main.20
Volume: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
Month: November
Year: 2020
Address: Online
Editors: Bonnie Webber, Trevor Cohn, Yulan He, Yang Liu
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 285–294
URL: https://aclanthology.org/2020.emnlp-main.20
DOI: 10.18653/v1/2020.emnlp-main.20
Cite (ACL): Kevin Clark, Minh-Thang Luong, Quoc Le, and Christopher D. Manning. 2020. Pre-Training Transformers as Energy-Based Cloze Models. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 285–294, Online. Association for Computational Linguistics.
Cite (Informal): Pre-Training Transformers as Energy-Based Cloze Models (Clark et al., EMNLP 2020)
PDF: https://aclanthology.org/2020.emnlp-main.20.pdf
Video: https://slideslive.com/38939095
Code: google-research/electra
Data: GLUE, LibriSpeech, OpenWebText, WebText
