ELMER: A Non-Autoregressive Pre-trained Language Model for Efficient and Effective Text Generation

Junyi Li, Tianyi Tang, Wayne Xin Zhao, Jian-Yun Nie, Ji-Rong Wen


Abstract
We study text generation with pre-trained language models (PLMs). Typically, an auto-regressive (AR) method is adopted, generating texts token by token. Despite its many advantages, AR generation suffers from inefficient inference. Non-autoregressive (NAR) models have therefore been proposed to generate all target tokens simultaneously. However, NAR models usually produce texts of lower quality because token dependencies in the output are not modeled. In this paper, we propose ELMER, an efficient and effective PLM for NAR text generation that explicitly models token dependencies during NAR generation. Leveraging the early exit technique, ELMER allows tokens to be generated at different layers according to their prediction confidence (a more confident token exits at a lower layer). In addition, we propose a novel pre-training objective, Layer Permutation Language Modeling, which pre-trains ELMER by permuting the exit layer for each token in a sequence. Experiments on three text generation tasks show that ELMER significantly outperforms NAR models and further narrows the performance gap with AR PLMs (29.92 vs. BART's 30.61 ROUGE-L on XSUM) while achieving over 10 times inference speedup.
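To make the abstract's two ideas concrete, below is a minimal PyTorch sketch, not the authors' released code: a decoder stack where every layer shares one LM head (so any layer is a valid exit), confidence-based early exit at inference, and a layer-permuted LM loss for pre-training. The class name, layer sizes, 0.9 threshold, and helper methods are illustrative assumptions; encoder cross-attention, length prediction, and the paper's exact handling of exited hidden states are omitted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EarlyExitNARDecoder(nn.Module):
    """Hypothetical stand-in for an ELMER-style decoder: all layers share
    one LM head, so every layer can serve as an exit point."""

    def __init__(self, vocab_size=50265, d_model=768, n_heads=12, n_layers=12):
        super().__init__()
        # Bidirectional self-attention over all target positions (NAR);
        # cross-attention to the encoder is omitted for brevity.
        self.layers = nn.ModuleList(
            nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
            for _ in range(n_layers)
        )
        self.lm_head = nn.Linear(d_model, vocab_size)  # shared by all layers

    @torch.no_grad()
    def generate(self, hidden, threshold=0.9):
        """Early-exit NAR decoding: all positions are predicted in parallel,
        and each position commits to a token at the first layer whose
        prediction confidence passes the threshold (simplified: hidden
        states still flow through every layer after exiting)."""
        batch, seq_len, _ = hidden.shape
        tokens = hidden.new_zeros(batch, seq_len, dtype=torch.long)
        exited = hidden.new_zeros(batch, seq_len, dtype=torch.bool)
        for layer in self.layers:
            hidden = layer(hidden)
            conf, pred = self.lm_head(hidden).softmax(dim=-1).max(dim=-1)
            newly = (conf >= threshold) & ~exited  # first confident layer wins
            tokens[newly] = pred[newly]
            exited |= newly
        tokens[~exited] = pred[~exited]            # fall back to the top layer
        return tokens

    def lplm_loss(self, hidden, targets):
        """Layer Permutation Language Modeling (sketch): sample a random exit
        layer per token and apply the LM loss at that layer, so every layer
        learns to predict and is a usable exit at inference time."""
        batch, seq_len, _ = hidden.shape
        exit_layer = torch.randint(len(self.layers), (batch, seq_len),
                                   device=hidden.device)
        per_token_loss = hidden.new_zeros(batch, seq_len)
        for i, layer in enumerate(self.layers):
            hidden = layer(hidden)
            mask = exit_layer == i                 # tokens exiting at layer i
            if mask.any():
                logits = self.lm_head(hidden[mask])
                per_token_loss[mask] = F.cross_entropy(
                    logits, targets[mask], reduction="none")
        return per_token_loss.mean()
```

In this reading of the abstract, sharing one LM head across layers is what makes "exit anywhere" possible, and permuting exit layers during pre-training ensures no single layer is privileged, so confidence-based exits at inference save computation without a large quality drop.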
Anthology ID:
2022.emnlp-main.68
Volume:
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
Month:
December
Year:
2022
Address:
Abu Dhabi, United Arab Emirates
Editors:
Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
1044–1058
URL:
https://aclanthology.org/2022.emnlp-main.68
DOI:
10.18653/v1/2022.emnlp-main.68
Cite (ACL):
Junyi Li, Tianyi Tang, Wayne Xin Zhao, Jian-Yun Nie, and Ji-Rong Wen. 2022. ELMER: A Non-Autoregressive Pre-trained Language Model for Efficient and Effective Text Generation. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 1044–1058, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
Cite (Informal):
ELMER: A Non-Autoregressive Pre-trained Language Model for Efficient and Effective Text Generation (Li et al., EMNLP 2022)
PDF:
https://aclanthology.org/2022.emnlp-main.68.pdf