Songlin He


2025

LLM-based Open Domain Planning by Leveraging Entity-Attribute-Level Domain Models
Dongning Rao | Songlin He | Zhihua Jiang | Ruishi Liang
Findings of the Association for Computational Linguistics: EMNLP 2025

Large language model (LLM)-based open-domain natural language planning (LONG) still has considerable room for improvement. For example, non-reusable plans with incomplete intermediate states and missing steps hinder real-world applications. To remedy these flaws, this paper establishes a dataset and a baseline for LONG. GOLD is the largest dataset of textual procedures to date, and it includes corresponding reusable formal planning domain definitions. The baseline, DIGGER, leverages entity-attribute-level action models, which reveal the relevant implicit physical properties (i.e., attributes) of salient entities in actions. DIGGER first extracts action models and builds typed entity lists from textual procedures. Then, it builds goal states for new tasks and instantiates grounded actions using domain prediction. Finally, plans are generalized and translated into textual procedures by an LLM. Reference-based metrics, LLM-as-a-Judge, and human evaluation are employed to evaluate LONG comprehensively. Experiments on GOLD show that DIGGER is stronger and more generalizable than recently proposed approaches and LLMs: it performs best on seen domains and applies to unseen domains without adaptation. Specifically, the BLEU-1 score increased from 0.385 to 0.408 on seen domains and rose to 0.310 on unseen domains.
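
For readers unfamiliar with the BLEU-1 metric reported above, the following minimal Python sketch shows how a sentence-level BLEU-1 score is commonly computed: clipped unigram precision multiplied by a brevity penalty. It is an illustration only, not the paper's evaluation code. The function name `bleu1`, the lowercased whitespace tokenization, and the use of a single reference are all assumptions made here; the abstract does not specify these details.

```python
import math
from collections import Counter


def bleu1(candidate: str, reference: str) -> float:
    """Sentence-level BLEU-1 against a single reference (illustrative sketch).

    Assumptions (not taken from the paper): lowercased whitespace
    tokenization, one reference, no smoothing.
    """
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand:
        return 0.0

    cand_counts = Counter(cand)
    ref_counts = Counter(ref)

    # Clipped matches: a candidate token is credited at most as many
    # times as it occurs in the reference.
    clipped = sum(min(n, ref_counts[tok]) for tok, n in cand_counts.items())
    precision = clipped / len(cand)

    # Brevity penalty: candidates shorter than the reference are penalized.
    if len(cand) > len(ref):
        bp = 1.0
    else:
        bp = math.exp(1 - len(ref) / len(cand))

    return bp * precision
```

Under these assumptions, a generated procedure that reproduces the reference exactly scores 1.0, while a candidate that merely repeats one reference word is held down by clipping, and a candidate much shorter than the reference is held down by the brevity penalty.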