Exposure Bias versus Self-Recovery: Are Distortions Really Incremental for Autoregressive Text Generation?

Tianxing He; Jingzhao Zhang; Zhiming Zhou; James Glass

doi:10.18653/v1/2021.emnlp-main.415

Abstract

Exposure bias has been regarded as a central problem for auto-regressive language models (LMs). The hypothesis is that teacher forcing causes test-time generation to be incrementally distorted because of the training-generation discrepancy. Although many algorithms have been proposed to avoid teacher forcing and thereby alleviate exposure bias, little work shows how serious the exposure bias problem actually is. In this work, we focus on the task of open-ended language generation and propose metrics to quantify the impact of exposure bias in terms of quality, diversity, and consistency. Our key intuition is that if we feed ground-truth data prefixes (instead of prefixes generated by the model itself) into the model and ask it to continue the generation, the performance should become much better because the training-generation discrepancy in the prefix is removed. Both automatic and human evaluations are conducted in our experiments. Contrary to the popular belief in exposure bias, we find that the distortion induced by the prefix discrepancy is limited and does not seem to be incremental during generation. Moreover, our analysis reveals an interesting self-recovery ability of the LM, which we hypothesize counters the harmful effects of exposure bias.
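
The prefix comparison described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the toy bigram LM, the tiny corpus, the helper names (`train_bigram`, `sample_tokens`, `seen_trigram_rate`, `compare_prefix_conditions`), and the trigram-based score, which is only a stand-in for the quality, diversity, and consistency metrics the paper proposes. The sketch shows only the protocol: continue once from a ground-truth data prefix and once from a model-generated prefix, then score both sets of continuations with the same metric.

```python
import random
from collections import Counter, defaultdict

BOS = "<s>"  # hypothetical start-of-sentence marker


def train_bigram(corpus):
    """Count bigram transitions: {previous_token: Counter(next_token)}."""
    counts = defaultdict(Counter)
    for sent in corpus:
        tokens = [BOS] + list(sent)
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts


def sample_tokens(counts, prefix, length, rng):
    """Sample `length` tokens continuing `prefix` (an empty prefix starts at BOS)."""
    prev = prefix[-1] if prefix else BOS
    out = []
    for _ in range(length):
        # Back off to the BOS distribution when the context has no successor.
        options = counts.get(prev) or counts[BOS]
        tokens = list(options)
        prev = rng.choices(tokens, weights=[options[t] for t in tokens], k=1)[0]
        out.append(prev)
    return out


def seen_trigram_rate(train_trigrams, prefix, continuation):
    """Toy score: fraction of trigrams (spanning the prefix boundary) seen in training."""
    tokens = list(prefix[-2:]) + list(continuation)
    trigrams = list(zip(tokens, tokens[1:], tokens[2:]))
    if not trigrams:
        return 0.0
    return sum(t in train_trigrams for t in trigrams) / len(trigrams)


def compare_prefix_conditions(corpus, prefix_len=3, cont_len=5, n_samples=200, seed=0):
    """Average the toy score for data-prefix and model-prefix continuations."""
    rng = random.Random(seed)
    counts = train_bigram(corpus)
    train_trigrams = {t for s in corpus for t in zip(s, s[1:], s[2:])}
    data_scores, model_scores = [], []
    for _ in range(n_samples):
        data_prefix = rng.choice(corpus)[:prefix_len]              # ground-truth prefix
        model_prefix = sample_tokens(counts, [], prefix_len, rng)  # self-generated prefix
        for prefix, scores in ((data_prefix, data_scores), (model_prefix, model_scores)):
            continuation = sample_tokens(counts, prefix, cont_len, rng)
            scores.append(seen_trigram_rate(train_trigrams, prefix, continuation))
    return sum(data_scores) / n_samples, sum(model_scores) / n_samples


corpus = [
    "the cat sat on the mat".split(),
    "the dog sat on the rug".split(),
    "a cat chased the dog".split(),
    "a dog chased the cat on the mat".split(),
]
data_score, model_score = compare_prefix_conditions(corpus)
print(f"data-prefix score:  {data_score:.3f}")
print(f"model-prefix score: {model_score:.3f}")
```

A large gap between the two averages would be what the exposure bias hypothesis predicts; a small gap would match the limited distortion the abstract reports. A bigram model conditions on a single token, so this toy cannot reproduce the paper's findings and only demonstrates the shape of the comparison.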

Anthology ID: 2021.emnlp-main.415
Volume: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
Month: November
Year: 2021
Address: Online and Punta Cana, Dominican Republic
Editors: Marie-Francine Moens, Xuanjing Huang, Lucia Specia, Scott Wen-tau Yih
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 5087–5102
URL: https://aclanthology.org/2021.emnlp-main.415/
DOI: 10.18653/v1/2021.emnlp-main.415
Cite (ACL): Tianxing He, Jingzhao Zhang, Zhiming Zhou, and James Glass. 2021. Exposure Bias versus Self-Recovery: Are Distortions Really Incremental for Autoregressive Text Generation? In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 5087–5102, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.
Cite (Informal): Exposure Bias versus Self-Recovery: Are Distortions Really Incremental for Autoregressive Text Generation? (He et al., EMNLP 2021)
PDF: https://aclanthology.org/2021.emnlp-main.415.pdf
Video: https://aclanthology.org/2021.emnlp-main.415.mp4
