Guidelines as Environments: A World Model Approach to Rule Following

Haiqing Li; Wenliang Zhong; Yinhao Wu; Hehuan Ma; Yuzhi Guo; Thao M. Dang; Junzhou Huang

doi:10.18653/v1/2026.acl-long.741

Abstract

Guideline-following is increasingly important in compliance, customer support, and other regulated workflows, where correctness is defined by explicit rule systems rather than heuristics. Learning to follow guidelines is challenging because guidelines are interdependent: rules can trigger, suppress, or conflict with one another, while locally plausible responses may violate global constraints. Most existing methods treat guidelines as static text and rely on implicit reasoning or deeper decoding, making rule interactions and satisfaction status hard to observe and control. A more feasible approach is to model guideline execution with an explicit state that tracks evolving rule evidence across steps. However, conventional world models are a poor fit: they typically assume privileged feedback or well-defined transition dynamics, assumptions that do not hold when reasoning occurs purely in language space under ambiguous, text-defined constraints. As a solution, we propose RGCWM, a Rule-Grounded Causal World Model that builds an explicit state space from the guideline text itself. RGCWM represents rule applicability and satisfaction as a continuously updated evidence state, externalizes inter-rule dependencies as a causal structure, and plans at inference time by counterfactually evaluating candidate responses under model-estimated state transitions. Experiments show that this shift from implicit text reasoning to state-based reasoning enables stable, controllable execution of complex interacting rules across diverse domains.
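The state-based loop the abstract describes (an evidence state per rule, explicit inter-rule dependencies, and counterfactual evaluation of candidate responses) can be illustrated with a toy sketch. This is not the authors' RGCWM implementation: the `Rule` class, the keyword-based `transition` function, the `suppresses` field, and the scoring rule are all hypothetical stand-ins chosen for illustration, and keyword matching here replaces the model-estimated state transitions mentioned in the abstract.

```python
from __future__ import annotations
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule record; not taken from the paper."""
    name: str
    keywords: tuple[str, ...]          # toy evidence detector
    suppresses: tuple[str, ...] = ()   # rules made inapplicable once this one is satisfied


def transition(state: dict[str, float], rules: list[Rule], response: str) -> dict[str, float]:
    """Toy transition: return a NEW evidence state after hypothetically emitting `response`."""
    new_state = dict(state)  # copy, so the real state is untouched (counterfactual)
    text = response.lower()
    for rule in rules:
        if any(keyword in text for keyword in rule.keywords):
            new_state[rule.name] = max(new_state[rule.name], 1.0)
    return new_state


def score(state: dict[str, float], rules: list[Rule]) -> float:
    """Mean evidence over rules that are still applicable (not suppressed)."""
    suppressed = {s for r in rules if state[r.name] >= 0.5 for s in r.suppresses}
    applicable = [r for r in rules if r.name not in suppressed]
    return sum(state[r.name] for r in applicable) / max(len(applicable), 1)


def plan(state: dict[str, float], rules: list[Rule], candidates: list[str]) -> str:
    """Pick the candidate whose simulated next state scores highest."""
    return max(candidates, key=lambda c: score(transition(state, rules, c), rules))


# Assumed example guideline: escalation suppresses the refund rule.
RULES = [
    Rule("verify_identity", ("verify",)),
    Rule("offer_refund", ("refund",)),
    Rule("escalate", ("escalate",), suppresses=("offer_refund",)),
]
CANDIDATES = [
    "I can offer you a refund right away.",
    "Let me verify your identity, then offer a refund.",
    "Let me verify your identity and escalate this to a specialist.",
]

initial_state = {rule.name: 0.0 for rule in RULES}
best = plan(initial_state, RULES, CANDIDATES)
print(best)
```

Because `transition` works on a copy, each candidate is evaluated against the same starting state, which is what makes the comparison counterfactual rather than sequential. In this toy setup the third candidate wins because escalation removes the refund rule from the applicable set.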

Anthology ID: 2026.acl-long.741
Volume: Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 16302–16318
URL: https://aclanthology.org/2026.acl-long.741/
DOI: 10.18653/v1/2026.acl-long.741
Cite (ACL): Haiqing Li, Wenliang Zhong, Yinhao Wu, Hehuan Ma, Yuzhi Guo, Thao M. Dang, and Junzhou Huang. 2026. Guidelines as Environments: A World Model Approach to Rule Following. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 16302–16318, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): Guidelines as Environments: A World Model Approach to Rule Following (Li et al., ACL 2026)
PDF: https://aclanthology.org/2026.acl-long.741.pdf
Checklist: 2026.acl-long.741.checklist.pdf
