Memorize Step by Step: Efficient Long-Context Prefilling with Incremental Memory and Decremental Chunk

Authors: Zhiyuan Zeng, Qipeng Guo, Xiaoran Liu, Zhangyue Yin, Wentao Shu, Mianqiu Huang, Bo Wang, Yunhua Zhou, Linlin Li, Qun Liu, Xipeng Qiu
Date: 2024-11
Venue: Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Editors: Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Publisher: Association for Computational Linguistics
Location: Miami, Florida, USA
Type: conference publication
Citation key: zeng-etal-2024-memorize
DOI: 10.18653/v1/2024.emnlp-main.1169
URL: https://aclanthology.org/2024.emnlp-main.1169/
Pages: 21021-21034