On the Empirical Complexity of Reasoning and Planning in LLMs

Liwei Kang, Zirui Zhao, David Hsu, Wee Sun Lee


Abstract
Chain-of-thought (CoT), tree-of-thought (ToT), and related techniques work surprisingly well in practice for some complex reasoning tasks with Large Language Models (LLMs), but why? This work seeks the underlying reasons by conducting experimental case studies and linking the performance benefits to well-established sample and computational complexity principles in machine learning. We experimented with six reasoning tasks, ranging from grade school math, air travel planning, ..., to Blocksworld. The results suggest that (i) both CoT and ToT benefit significantly from task decomposition, which breaks a complex reasoning task into a sequence of steps with low sample complexity and explicitly outlines the reasoning structure; (ii) for computationally hard reasoning tasks, the more sophisticated tree structure of ToT outperforms the linear structure of CoT; and (iii) explicitly annotating key variables is important for good performance. These findings provide useful guidelines for using LLMs to solve reasoning tasks in practice.
Anthology ID:
2024.findings-emnlp.164
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2024
Month:
November
Year:
2024
Address:
Miami, Florida, USA
Editors:
Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
2897–2936
URL:
https://aclanthology.org/2024.findings-emnlp.164
DOI:
10.18653/v1/2024.findings-emnlp.164
Cite (ACL):
Liwei Kang, Zirui Zhao, David Hsu, and Wee Sun Lee. 2024. On the Empirical Complexity of Reasoning and Planning in LLMs. In Findings of the Association for Computational Linguistics: EMNLP 2024, pages 2897–2936, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal):
On the Empirical Complexity of Reasoning and Planning in LLMs (Kang et al., Findings 2024)
PDF:
https://aclanthology.org/2024.findings-emnlp.164.pdf