Style-Compress: An LLM-Based Prompt Compression Framework Considering Task-Specific Styles

Xiao Pu; Tianxing He; Xiaojun Wan

doi:10.18653/v1/2024.findings-emnlp.851

Abstract

Prompt compression condenses contexts while maintaining their informativeness for different usage scenarios. It not only shortens inference time and reduces computational cost when using large language models, but also lowers expenses when using closed-source models. In a preliminary study, we find that when language models are instructed to compress prompts, the compression style (e.g., extractive or abstractive) affects the performance of the compressed prompts on downstream tasks. Building on this insight, we propose Style-Compress, a lightweight framework that adapts a smaller language model to compress prompts for a larger model on a new task without additional training. Our approach iteratively generates and selects effective compressed prompts as task-specific demonstrations through style variation and in-context learning, enabling smaller models to act as efficient compressors given task-specific examples. Style-Compress outperforms two baseline compression models on four tasks: original prompt reconstruction, text summarization, multi-hop QA, and CoT reasoning. In addition, with only 10 samples and 100 queries for adaptation, prompts compressed by Style-Compress achieve performance on par with or better than the original prompts at a compression ratio of 0.25 or 0.5.
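The generate-and-select loop described in the abstract can be illustrated with a rough sketch. This is not the authors' implementation: the function names, the two-style list, the pool size, and the toy compressor and scorer are all hypothetical stand-ins for a smaller compressor model and a downstream-task metric. The word-level definition of compression ratio is also an assumption, since the paper may count tokens instead.

```python
from typing import Callable, List, Tuple

# Styles named in the abstract; the full framework may use more (assumption).
STYLES = ["extractive", "abstractive"]

Demo = Tuple[str, str, float]  # (original prompt, compressed prompt, score)


def compression_ratio(original: str, compressed: str) -> float:
    """Compressed length over original length, counted in words (assumed definition)."""
    return len(compressed.split()) / max(len(original.split()), 1)


def select_demonstrations(
    prompts: List[str],
    compress: Callable[[str, str, List[Demo]], str],
    score: Callable[[str, str], float],
    rounds: int = 3,
    pool_size: int = 2,
) -> List[Demo]:
    """Iteratively generate style-varied compressions and keep the best as demos."""
    demos: List[Demo] = []
    for _ in range(rounds):
        for prompt in prompts:
            # Style variation: one candidate per style, conditioned on current demos.
            candidates = [compress(prompt, style, demos) for style in STYLES]
            best = max(candidates, key=lambda c: score(prompt, c))
            demos.append((prompt, best, score(prompt, best)))
        # Selection: retain only the highest-scoring demonstrations.
        demos = sorted(demos, key=lambda d: d[2], reverse=True)[:pool_size]
    return demos


def toy_compress(prompt: str, style: str, demos: List[Demo]) -> str:
    """Hypothetical stand-in for an LLM compressor; ignores demos, halves the prompt."""
    words = prompt.split()
    keep = max(1, len(words) // 2)
    if style == "extractive":
        return " ".join(words[:keep])
    return " ".join(words[::2][:keep])


def toy_score(original: str, compressed: str) -> float:
    """Hypothetical stand-in for a task metric: share of distinct words retained."""
    orig = set(original.split())
    return len(orig & set(compressed.split())) / max(len(orig), 1)


demos = select_demonstrations(
    ["a b c d e f g h"], toy_compress, toy_score, rounds=2, pool_size=1
)
```

In a real setting, `compress` would call the smaller model with the selected demonstrations placed in its context, and `score` would measure how well the larger model performs on the downstream task when given the compressed prompt.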

Anthology ID: 2024.findings-emnlp.851
Volume: Findings of the Association for Computational Linguistics: EMNLP 2024
Month: November
Year: 2024
Address: Miami, Florida, USA
Editors: Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 14533–14549
URL: https://aclanthology.org/2024.findings-emnlp.851/
DOI: 10.18653/v1/2024.findings-emnlp.851
Cite (ACL): Xiao Pu, Tianxing He, and Xiaojun Wan. 2024. Style-Compress: An LLM-Based Prompt Compression Framework Considering Task-Specific Styles. In Findings of the Association for Computational Linguistics: EMNLP 2024, pages 14533–14549, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal): Style-Compress: An LLM-Based Prompt Compression Framework Considering Task-Specific Styles (Pu et al., Findings 2024)
PDF: https://aclanthology.org/2024.findings-emnlp.851.pdf
