Scheduled Sampling Based on Decoding Steps for Neural Machine Translation

Yijin Liu; Fandong Meng; Yufeng Chen; Jinan Xu (徐金安); Jie Zhou

doi:10.18653/v1/2021.emnlp-main.264

Abstract

Scheduled sampling is widely used to mitigate the exposure bias problem in neural machine translation. Its core motivation is to simulate the inference scenario during training by replacing ground-truth tokens with predicted tokens, thus bridging the gap between training and inference. However, vanilla scheduled sampling is based only on training steps and treats all decoding steps equally. That is, it simulates an inference scenario with uniform error rates, which does not match real inference, where later decoding steps usually have higher error rates due to error accumulation. To alleviate this discrepancy, we propose scheduled sampling methods based on decoding steps, which increase the chance of selecting predicted tokens as the decoding step grows. Consequently, we can simulate the inference scenario more realistically during training, better bridging the gap between training and inference. Moreover, we investigate scheduled sampling based on both training steps and decoding steps for further improvements. Experimentally, our approaches significantly outperform the Transformer baseline and vanilla scheduled sampling on three large-scale WMT tasks. Our approaches also generalize well to text summarization on two popular benchmarks.
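
The idea of sampling by decoding step can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state which schedule is used, so the exponential decay `k ** step`, the default `k = 0.99`, and the names `ground_truth_prob` and `mix_decoder_inputs` are all assumptions made for illustration. The sketch only shows the property the abstract describes, namely that the chance of feeding a predicted token grows with the decoding step.

```python
import random


def ground_truth_prob(step: int, k: float = 0.99) -> float:
    """Probability of feeding the ground-truth token at a 0-indexed decoding step.

    Assumption: exponential decay in the decoding step. The abstract only
    says that predicted tokens become more likely at later steps.
    """
    return k ** step


def mix_decoder_inputs(gold, predicted, k=0.99, rng=None):
    """Build decoder inputs by choosing, per position, gold or predicted tokens.

    `gold` and `predicted` are equal-length token sequences; `predicted`
    stands in for the model's own outputs from a first decoding pass.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    rng = rng or random.Random()
    mixed = []
    for step, (gold_tok, pred_tok) in enumerate(zip(gold, predicted)):
        # Later steps have a lower ground_truth_prob, so the predicted
        # token is selected more often as decoding proceeds.
        use_gold = rng.random() < ground_truth_prob(step, k)
        mixed.append(gold_tok if use_gold else pred_tok)
    return mixed
```

With `k = 1.0` the sketch reduces to plain teacher forcing (always the ground-truth token), while smaller values of `k` shift later positions toward predicted tokens more quickly.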

Anthology ID: 2021.emnlp-main.264
Volume: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
Month: November
Year: 2021
Address: Online and Punta Cana, Dominican Republic
Editors: Marie-Francine Moens, Xuanjing Huang, Lucia Specia, Scott Wen-tau Yih
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 3285–3296
URL: https://aclanthology.org/2021.emnlp-main.264/
DOI: 10.18653/v1/2021.emnlp-main.264
Cite (ACL): Yijin Liu, Fandong Meng, Yufeng Chen, Jinan Xu, and Jie Zhou. 2021. Scheduled Sampling Based on Decoding Steps for Neural Machine Translation. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 3285–3296, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.
Cite (Informal): Scheduled Sampling Based on Decoding Steps for Neural Machine Translation (Liu et al., EMNLP 2021)
PDF: https://aclanthology.org/2021.emnlp-main.264.pdf
Video: https://aclanthology.org/2021.emnlp-main.264.mp4
