From Pixels to Policies: Reinforcing Spatial Reasoning in Language Models for Content-Aware Layout Design

Sha Li; Stefano Petrangeli; Yu Shen; Xiang Chen

Abstract

We introduce LaySPA, a reinforcement learning framework that equips large language models (LLMs) with explicit and interpretable spatial reasoning for content-aware graphic layout design. LaySPA addresses two key challenges: LLMs’ limited spatial reasoning and the lack of transparency in design decision making. Instead of operating at the pixel level, we reformulate layout design as a policy learning problem over a structured textual spatial environment that explicitly encodes canvas geometry, element attributes, and inter-element relationships. LaySPA produces dual-level outputs comprising interpretable reasoning traces and structured layout specifications, enabling transparent and controllable design decision making. The layout design policy is optimized via a multi-objective spatial critique that decomposes layout quality into geometric validity, relational coherence, and aesthetic consistency, and is trained using relative group optimization to stabilize learning in open-ended design spaces. Experiments demonstrate that LaySPA improves structural validity and visual quality, outperforming larger proprietary LLMs and achieving performance comparable to specialized state-of-the-art layout generators while requiring fewer annotated samples.

Anthology ID: 2026.acl-industry.104
Volume: Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026)
Month: July
Year: 2026
Address: San Diego, California, USA
Editors: Yunyao Li, Georg Rehm, Mei Tu
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 1509–1518
URL: https://aclanthology.org/2026.acl-industry.104/
Cite (ACL): Sha Li, Stefano Petrangeli, Yu Shen, and Xiang Chen. 2026. From Pixels to Policies: Reinforcing Spatial Reasoning in Language Models for Content-Aware Layout Design. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026), pages 1509–1518, San Diego, California, USA. Association for Computational Linguistics.
Cite (Informal): From Pixels to Policies: Reinforcing Spatial Reasoning in Language Models for Content-Aware Layout Design (Li et al., ACL 2026)
PDF: https://aclanthology.org/2026.acl-industry.104.pdf
