Enhancing Presentation Slide Generation by LLMs with a Multi-Staged End-to-End Approach

Sambaran Bandyopadhyay, Himanshu Maheshwari, Anandhavelu Natarajan, Apoorv Saxena


Abstract
Generating presentation slides from a long document with multimodal elements such as text and images is an important task. Done manually, it is time-consuming and requires domain expertise. Existing approaches for generating a rich presentation from a document are often semi-automatic, or they only put a flat summary into the slides, ignoring the importance of a good narrative. In this paper, we address this research gap by proposing a multi-staged end-to-end model that combines a large language model (LLM) with a vision-language model (VLM). We show experimentally that, compared to applying LLMs directly with state-of-the-art prompting, our proposed multi-staged solution performs better on both automated metrics and human evaluation.
Anthology ID:
2024.inlg-main.18
Volume:
Proceedings of the 17th International Natural Language Generation Conference
Month:
September
Year:
2024
Address:
Tokyo, Japan
Editors:
Saad Mahamood, Nguyen Le Minh, Daphne Ippolito
Venue:
INLG
SIG:
SIGGEN
Publisher:
Association for Computational Linguistics
Pages:
222–229
URL:
https://aclanthology.org/2024.inlg-main.18
Cite (ACL):
Sambaran Bandyopadhyay, Himanshu Maheshwari, Anandhavelu Natarajan, and Apoorv Saxena. 2024. Enhancing Presentation Slide Generation by LLMs with a Multi-Staged End-to-End Approach. In Proceedings of the 17th International Natural Language Generation Conference, pages 222–229, Tokyo, Japan. Association for Computational Linguistics.
Cite (Informal):
Enhancing Presentation Slide Generation by LLMs with a Multi-Staged End-to-End Approach (Bandyopadhyay et al., INLG 2024)
PDF:
https://aclanthology.org/2024.inlg-main.18.pdf
Supplementary attachment:
2024.inlg-main.18.Supplementary_Attachment.pdf