Prompt Expansion for Adaptive Text-to-Image Generation

Siddhartha Datta, Alexander Ku, Deepak Ramachandran, Peter Anderson


Abstract
Text-to-image generation models are powerful but difficult to use. Users craft specific prompts to get better images, though the images can be repetitive. This paper proposes the Prompt Expansion framework that helps users generate high-quality, diverse images with less effort. The Prompt Expansion model takes a text query as input and outputs a set of expanded text prompts that are optimized such that when passed to a text-to-image model, they generate a wider variety of appealing images. We conduct a human evaluation study that shows that images generated through Prompt Expansion are more aesthetically pleasing and diverse than those generated by baseline methods. Overall, this paper presents a novel and effective approach to improving the text-to-image generation experience.
Anthology ID:
2024.acl-long.189
Volume:
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
August
Year:
2024
Address:
Bangkok, Thailand
Editors:
Lun-Wei Ku, Andre Martins, Vivek Srikumar
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
3449–3476
URL:
https://aclanthology.org/2024.acl-long.189/
DOI:
10.18653/v1/2024.acl-long.189
Cite (ACL):
Siddhartha Datta, Alexander Ku, Deepak Ramachandran, and Peter Anderson. 2024. Prompt Expansion for Adaptive Text-to-Image Generation. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 3449–3476, Bangkok, Thailand. Association for Computational Linguistics.
Cite (Informal):
Prompt Expansion for Adaptive Text-to-Image Generation (Datta et al., ACL 2024)
PDF:
https://aclanthology.org/2024.acl-long.189.pdf