The Effect of Pretraining on Extractive Summarization for Scientific Documents
Yash Gupta, Pawan Sasanka Ammanamanchi, Shikha Bordia, Arjun Manoharan, Deepak Mittal, Ramakanth Pasunuru, Manish Shrivastava, Maneesh Singh, Mohit Bansal, Preethi Jyothi
Abstract
Large pretrained models have seen enormous success in extractive summarization tasks. In this work, we investigate the influence of pretraining on a BERT-based extractive summarization system for scientific documents. We derive significant performance improvements using an intermediate pretraining step that leverages existing summarization datasets and report state-of-the-art results on a recently released scientific summarization dataset, SciTLDR. We systematically analyze the intermediate pretraining step by varying the size and domain of the pretraining corpus, changing the length of the input sequence in the target task, and varying target tasks. We also investigate how intermediate pretraining interacts with contextualized word embeddings trained on different domains.
- Anthology ID: 2021.sdp-1.9
- Volume: Proceedings of the Second Workshop on Scholarly Document Processing
- Month: June
- Year: 2021
- Address: Online
- Editors: Iz Beltagy, Arman Cohan, Guy Feigenblat, Dayne Freitag, Tirthankar Ghosal, Keith Hall, Drahomira Herrmannova, Petr Knoth, Kyle Lo, Philipp Mayr, Robert M. Patton, Michal Shmueli-Scheuer, Anita de Waard, Kuansan Wang, Lucy Lu Wang
- Venue: sdp
- Publisher: Association for Computational Linguistics
- Pages: 73–82
- URL: https://aclanthology.org/2021.sdp-1.9
- DOI: 10.18653/v1/2021.sdp-1.9
- Bibkey: gupta-etal-2021-effect
- Cite (ACL): Yash Gupta, Pawan Sasanka Ammanamanchi, Shikha Bordia, Arjun Manoharan, Deepak Mittal, Ramakanth Pasunuru, Manish Shrivastava, Maneesh Singh, Mohit Bansal, and Preethi Jyothi. 2021. The Effect of Pretraining on Extractive Summarization for Scientific Documents. In Proceedings of the Second Workshop on Scholarly Document Processing, pages 73–82, Online. Association for Computational Linguistics.
- Cite (Informal): The Effect of Pretraining on Extractive Summarization for Scientific Documents (Gupta et al., sdp 2021)
- PDF: https://aclanthology.org/2021.sdp-1.9.pdf
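The extractive systems studied in this paper are trained as supervised sentence scorers, which requires per-sentence "oracle" labels derived from the reference summary. As a rough illustration only (the paper's actual labeling procedure is not described here, and real pipelines use ROUGE rather than plain unigram F1), a greedy oracle selection can be sketched like this:

```python
# Illustrative sketch: greedy derivation of extractive oracle labels.
# A simplified stand-in for ROUGE-based selection; not the authors' exact method.

def unigram_f1(selected_words, reference_words):
    """F1 of word overlap between a candidate selection and the reference."""
    if not selected_words or not reference_words:
        return 0.0
    overlap = len(selected_words & reference_words)
    if overlap == 0:
        return 0.0
    p = overlap / len(selected_words)
    r = overlap / len(reference_words)
    return 2 * p * r / (p + r)

def greedy_oracle(doc_sentences, reference, max_sents=3):
    """Greedily pick sentence indices that most improve overlap with the reference."""
    ref_words = set(reference.lower().split())
    chosen, chosen_words, best = [], set(), 0.0
    for _ in range(max_sents):
        gains = [
            (unigram_f1(chosen_words | set(s.lower().split()), ref_words), i)
            for i, s in enumerate(doc_sentences) if i not in chosen
        ]
        if not gains:
            break
        score, idx = max(gains)
        if score <= best:  # stop once no sentence improves the overlap
            break
        best = score
        chosen.append(idx)
        chosen_words |= set(doc_sentences[idx].lower().split())
    return sorted(chosen)
```

Sentences whose indices are returned get label 1 and the rest label 0, giving the binary targets that a BERT-based scorer is then trained on, first on the intermediate pretraining corpus and then on the target dataset.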
Export citation
@inproceedings{gupta-etal-2021-effect,
    title = "The Effect of Pretraining on Extractive Summarization for Scientific Documents",
    author = "Gupta, Yash and Ammanamanchi, Pawan Sasanka and Bordia, Shikha and Manoharan, Arjun and Mittal, Deepak and Pasunuru, Ramakanth and Shrivastava, Manish and Singh, Maneesh and Bansal, Mohit and Jyothi, Preethi",
    editor = "Beltagy, Iz and Cohan, Arman and Feigenblat, Guy and Freitag, Dayne and Ghosal, Tirthankar and Hall, Keith and Herrmannova, Drahomira and Knoth, Petr and Lo, Kyle and Mayr, Philipp and Patton, Robert M. and Shmueli-Scheuer, Michal and de Waard, Anita and Wang, Kuansan and Wang, Lucy Lu",
    booktitle = "Proceedings of the Second Workshop on Scholarly Document Processing",
    month = jun,
    year = "2021",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.sdp-1.9",
    doi = "10.18653/v1/2021.sdp-1.9",
    pages = "73--82",
    abstract = "Large pretrained models have seen enormous success in extractive summarization tasks. In this work, we investigate the influence of pretraining on a BERT-based extractive summarization system for scientific documents. We derive significant performance improvements using an intermediate pretraining step that leverages existing summarization datasets and report state-of-the-art results on a recently released scientific summarization dataset, SciTLDR. We systematically analyze the intermediate pretraining step by varying the size and domain of the pretraining corpus, changing the length of the input sequence in the target task and varying target tasks. We also investigate how intermediate pretraining interacts with contextualized word embeddings trained on different domains.",
}