%0 Conference Proceedings
%T D2S: Document-to-Slide Generation Via Query-Based Text Summarization
%A Sun, Edward
%A Hou, Yufang
%A Wang, Dakuo
%A Zhang, Yunfeng
%A Wang, Nancy X. R.
%Y Toutanova, Kristina
%Y Rumshisky, Anna
%Y Zettlemoyer, Luke
%Y Hakkani-Tur, Dilek
%Y Beltagy, Iz
%Y Bethard, Steven
%Y Cotterell, Ryan
%Y Chakraborty, Tanmoy
%Y Zhou, Yichao
%S Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
%D 2021
%8 June
%I Association for Computational Linguistics
%C Online
%F sun-etal-2021-d2s
%X Presentations are critical for communication in all areas of our lives, yet the creation of slide decks is often tedious and time-consuming. There has been limited research aiming to automate the document-to-slides generation process and all face a critical challenge: no publicly available dataset for training and benchmarking. In this work, we first contribute a new dataset, SciDuet, consisting of pairs of papers and their corresponding slides decks from recent years’ NLP and ML conferences (e.g., ACL). Secondly, we present D2S, a novel system that tackles the document-to-slides task with a two-step approach: 1) Use slide titles to retrieve relevant and engaging text, figures, and tables; 2) Summarize the retrieved context into bullet points with long-form question answering. Our evaluation suggests that long-form QA outperforms state-of-the-art summarization baselines on both automated ROUGE metrics and qualitative human evaluation.
%R 10.18653/v1/2021.naacl-main.111
%U https://aclanthology.org/2021.naacl-main.111
%U https://doi.org/10.18653/v1/2021.naacl-main.111
%P 1405-1418