%0 Conference Proceedings
%T A Thorough Evaluation of Task-Specific Pretraining for Summarization
%A Rothe, Sascha
%A Maynez, Joshua
%A Narayan, Shashi
%Y Moens, Marie-Francine
%Y Huang, Xuanjing
%Y Specia, Lucia
%Y Yih, Scott Wen-tau
%S Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
%D 2021
%8 November
%I Association for Computational Linguistics
%C Online and Punta Cana, Dominican Republic
%F rothe-etal-2021-thorough
%X Task-agnostic pretraining objectives like masked language models or corrupted span prediction are applicable to a wide range of NLP downstream tasks (Raffel et al., 2019), but are outperformed by task-specific pretraining objectives like predicting extracted gap sentences for summarization (Zhang et al., 2020). We compare three summarization-specific pretraining objectives with the task-agnostic corrupted span prediction pretraining in a controlled study. We also extend our study to a low-resource and zero-shot setup to understand how many training examples are needed in order to ablate the task-specific pretraining without quality loss. Our results show that task-agnostic pretraining is sufficient for most cases, which we hope reduces the need for costly task-specific pretraining. We also report new state-of-the-art numbers for two summarization tasks using a T5 model with 11 billion parameters and an optimal beam search length penalty.
%R 10.18653/v1/2021.emnlp-main.12
%U https://aclanthology.org/2021.emnlp-main.12
%U https://doi.org/10.18653/v1/2021.emnlp-main.12
%P 140-145