@inproceedings{que-rong-2025-pic,
title = "{PIC}: Unlocking Long-Form Text Generation Capabilities of Large Language Models via Position {ID} Compression",
author = "Que, Haoran and
Rong, Wenge",
editor = "Che, Wanxiang and
Nabende, Joyce and
Shutova, Ekaterina and
Pilehvar, Mohammad Taher",
booktitle = "Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
month = jul,
year = "2025",
address = "Vienna, Austria",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2025.acl-long.347/",
doi = "10.18653/v1/2025.acl-long.347",
pages = "6982--6995",
ISBN = "979-8-89176-251-0",
abstract = "Long-context understanding is crucial for large language models (LLMs) and has become a fundamental capability for most LLMs. However, beyond the focus on ``input-long'', the ability to ``output-long'' is equally significant, yet it remains underexplored. To address this limitation, we propose a simple, efficient, and plug-in approach, Position ID Compression (PIC), to unlock the long-form text generation potential of LLMs. The idea is straightforward: by compressing the position ids of the context, we provoke and guide LLMs to generate coherent and longer output. Specifically, we find that directly reducing the position ids by a fixed ratio significantly impacts the generation quality. To mitigate this, we propose two variants of PIC: NTK-aware PIC and Dynamic PIC. Without additional training, both methods enable LLMs to extend their generation length by approximately 1.5 times without compromising generation quality. Furthermore, by integrating supervised fine-tuning (SFT) with PIC, we propose PIC-SFT, which further improves LLMs' long-form text generation capabilities, achieving top performance on HelloBench and LongBench-Write. Extensive experiments demonstrate the effectiveness of our approach."
}
<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="que-rong-2025-pic">
<titleInfo>
<title>PIC: Unlocking Long-Form Text Generation Capabilities of Large Language Models via Position ID Compression</title>
</titleInfo>
<name type="personal">
<namePart type="given">Haoran</namePart>
<namePart type="family">Que</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Wenge</namePart>
<namePart type="family">Rong</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<originInfo>
<dateIssued>2025-07</dateIssued>
</originInfo>
<typeOfResource>text</typeOfResource>
<relatedItem type="host">
<titleInfo>
<title>Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)</title>
</titleInfo>
<name type="personal">
<namePart type="given">Wanxiang</namePart>
<namePart type="family">Che</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Joyce</namePart>
<namePart type="family">Nabende</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Ekaterina</namePart>
<namePart type="family">Shutova</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Mohammad</namePart>
<namePart type="given">Taher</namePart>
<namePart type="family">Pilehvar</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<originInfo>
<publisher>Association for Computational Linguistics</publisher>
<place>
<placeTerm type="text">Vienna, Austria</placeTerm>
</place>
</originInfo>
<genre authority="marcgt">conference publication</genre>
<identifier type="isbn">979-8-89176-251-0</identifier>
</relatedItem>
<abstract>Long-context understanding is crucial for large language models (LLMs) and has become a fundamental capability for most LLMs. However, beyond the focus on “input-long”, the ability to “output-long” is equally significant, yet it remains underexplored. To address this limitation, we propose a simple, efficient, and plug-in approach, Position ID Compression (PIC), to unlock the long-form text generation potential of LLMs. The idea is straightforward: by compressing the position ids of the context, we provoke and guide LLMs to generate coherent and longer output. Specifically, we find that directly reducing the position ids by a fixed ratio significantly impacts the generation quality. To mitigate this, we propose two variants of PIC: NTK-aware PIC and Dynamic PIC. Without additional training, both methods enable LLMs to extend their generation length by approximately 1.5 times without compromising generation quality. Furthermore, by integrating supervised fine-tuning (SFT) with PIC, we propose PIC-SFT, which further improves LLMs’ long-form text generation capabilities, achieving top performance on HelloBench and LongBench-Write. Extensive experiments demonstrate the effectiveness of our approach.</abstract>
<identifier type="citekey">que-rong-2025-pic</identifier>
<identifier type="doi">10.18653/v1/2025.acl-long.347</identifier>
<location>
<url>https://aclanthology.org/2025.acl-long.347/</url>
</location>
<part>
<date>2025-07</date>
<extent unit="page">
<start>6982</start>
<end>6995</end>
</extent>
</part>
</mods>
</modsCollection>
%0 Conference Proceedings
%T PIC: Unlocking Long-Form Text Generation Capabilities of Large Language Models via Position ID Compression
%A Que, Haoran
%A Rong, Wenge
%Y Che, Wanxiang
%Y Nabende, Joyce
%Y Shutova, Ekaterina
%Y Pilehvar, Mohammad Taher
%S Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
%D 2025
%8 July
%I Association for Computational Linguistics
%C Vienna, Austria
%@ 979-8-89176-251-0
%F que-rong-2025-pic
%X Long-context understanding is crucial for large language models (LLMs) and has become a fundamental capability for most LLMs. However, beyond the focus on “input-long”, the ability to “output-long” is equally significant, yet it remains underexplored. To address this limitation, we propose a simple, efficient, and plug-in approach, Position ID Compression (PIC), to unlock the long-form text generation potential of LLMs. The idea is straightforward: by compressing the position ids of the context, we provoke and guide LLMs to generate coherent and longer output. Specifically, we find that directly reducing the position ids by a fixed ratio significantly impacts the generation quality. To mitigate this, we propose two variants of PIC: NTK-aware PIC and Dynamic PIC. Without additional training, both methods enable LLMs to extend their generation length by approximately 1.5 times without compromising generation quality. Furthermore, by integrating supervised fine-tuning (SFT) with PIC, we propose PIC-SFT, which further improves LLMs’ long-form text generation capabilities, achieving top performance on HelloBench and LongBench-Write. Extensive experiments demonstrate the effectiveness of our approach.
%R 10.18653/v1/2025.acl-long.347
%U https://aclanthology.org/2025.acl-long.347/
%U https://doi.org/10.18653/v1/2025.acl-long.347
%P 6982-6995
Markdown (Informal)
[PIC: Unlocking Long-Form Text Generation Capabilities of Large Language Models via Position ID Compression](https://aclanthology.org/2025.acl-long.347/) (Que & Rong, ACL 2025)
ACL
Haoran Que and Wenge Rong. 2025. PIC: Unlocking Long-Form Text Generation Capabilities of Large Language Models via Position ID Compression. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 6982–6995, Vienna, Austria. Association for Computational Linguistics.
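
Note: the sketch below is a minimal, speculative illustration of the fixed-ratio position ID compression idea described in the abstract, assuming a batch-first `position_ids` tensor as used by common RoPE-based transformer implementations. The function name, the 1.5 ratio, and the flooring scheme are illustrative assumptions only; this is not the paper's implementation, and the NTK-aware PIC and Dynamic PIC variants the paper actually recommends are not reproduced here.

```python
import torch

def compress_position_ids(prompt_len: int, ratio: float = 1.5) -> torch.Tensor:
    """Toy illustration of fixed-ratio position ID compression.

    The prompt's position ids 0..prompt_len-1 are squeezed into a shorter
    range by dividing by `ratio`, leaving more of the model's trained
    positional range free for generated tokens. The abstract notes that this
    naive fixed-ratio scheme can hurt generation quality, which motivates the
    paper's NTK-aware and Dynamic variants (not shown here).
    """
    ids = torch.arange(prompt_len, dtype=torch.float32)
    compressed = torch.floor(ids / ratio).to(torch.long)  # e.g. 0, 0, 1, 2, 2, 3, ...
    return compressed.unsqueeze(0)  # shape (1, prompt_len), batch-first

# Example: a 12-token prompt compressed by 1.5x occupies ids 0..7 instead of 0..11.
print(compress_position_ids(12, ratio=1.5).tolist())
```

In frameworks whose forward pass accepts an explicit `position_ids` argument, a tensor like this would stand in for the default 0..n-1 ids of the prompt, so that subsequent generated tokens start from a smaller position offset.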