AgentOCR: Reimagining Agent History via Optical Self-Compression

Lang Feng; Fuchao Yang; Feng Chen; Xin Cheng; Haiyang Xu; Zhenglin Wan; Ming Yan; Bo An

Abstract

Recent advances in large language models (LLMs) enable agentic systems trained with reinforcement learning (RL) over multi-turn interaction, but practical deployment is bottlenecked by rapidly growing textual histories that inflate token and memory costs. We introduce AgentOCR, a framework that exploits visual tokens’ superior information density by representing the accumulated observation-action history as a compact rendered image. To make multi-turn rollouts scalable, AgentOCR proposes segment optical caching: by decomposing the history into hashable segments and maintaining a visual cache, this mechanism eliminates redundant re-rendering. Beyond fixed rendering, AgentOCR introduces agentic self-compression, where the agent actively emits a compression rate and is trained with a compression-aware reward to adaptively balance task success and token efficiency. We conduct extensive experiments on two challenging agentic benchmarks, ALFWorld and search-based QA. AgentOCR preserves over 95% of text-based agent performance while reducing token consumption by more than 50%, yielding consistent token and memory efficiency. Further analysis validates a 20× rendering speedup from optical caching and effective self-compression balancing. Our code is available at https://github.com/langfengQ/AgentOCR.
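The segment caching idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `SegmentRenderCache`, the stand-in renderer `fake_render`, the choice of SHA-256 as the segment hash, and the example history strings are all assumptions made for illustration. The sketch only shows the general pattern of keying rendered segments by a content hash so that history seen in earlier turns is not rendered again.

```python
import hashlib
from typing import Callable, Dict, List


class SegmentRenderCache:
    """Hypothetical cache of rendered history segments, keyed by content hash."""

    def __init__(self, render_fn: Callable[[str], bytes]) -> None:
        self._render_fn = render_fn          # assumed text-to-image renderer
        self._cache: Dict[str, bytes] = {}   # hash of segment -> rendered tile
        self.render_calls = 0                # counts actual (uncached) renders

    @staticmethod
    def _key(segment: str) -> str:
        # SHA-256 is an illustrative choice; any stable hash would serve.
        return hashlib.sha256(segment.encode("utf-8")).hexdigest()

    def render_history(self, segments: List[str]) -> List[bytes]:
        """Return one rendered tile per segment, rendering only unseen ones."""
        tiles = []
        for segment in segments:
            key = self._key(segment)
            if key not in self._cache:
                self._cache[key] = self._render_fn(segment)
                self.render_calls += 1
            tiles.append(self._cache[key])
        return tiles


def fake_render(segment: str) -> bytes:
    # Stand-in for a real renderer; returns tagged bytes instead of pixels.
    return b"IMG:" + segment.encode("utf-8")


cache = SegmentRenderCache(fake_render)
history = ["obs: you are in a kitchen", "act: open fridge"]
cache.render_history(history)            # turn 1: both segments are rendered

history.append("obs: the fridge is open")
tiles = cache.render_history(history)    # turn 2: only the new segment is rendered
```

In this toy setting the second call renders one new segment rather than all three, which is the kind of redundancy a per-segment cache is meant to remove as the history grows across turns.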

Anthology ID: 2026.acl-long.230
Volume: Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 5067–5086
URL: https://aclanthology.org/2026.acl-long.230/
Cite (ACL): Lang Feng, Fuchao Yang, Feng Chen, Xin Cheng, Haiyang Xu, Zhenglin Wan, Ming Yan, and Bo An. 2026. AgentOCR: Reimagining Agent History via Optical Self-Compression. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 5067–5086, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): AgentOCR: Reimagining Agent History via Optical Self-Compression (Feng et al., ACL 2026)
PDF: https://aclanthology.org/2026.acl-long.230.pdf
Checklist: 2026.acl-long.230.checklist.pdf
