Can Large Language Models Reason About Goal-Oriented Tasks?

Filippos Bellos, Yayuan Li, Wuao Liu, Jason Corso


Abstract
Most adults can complete a sequence of steps to achieve a certain goal, such as making a sandwich or repairing a bicycle tire. In completing these goal-oriented tasks, or simply tasks in this paper, one must use sequential reasoning to understand the relationship between the sequence of steps and the goal. LLMs have shown impressive capabilities across various natural language understanding tasks. However, prior work has mainly focused on logical reasoning tasks (e.g., arithmetic, commonsense QA); how well LLMs can perform on more complex reasoning tasks like sequential reasoning is not clear. In this paper, we address this gap and conduct a comprehensive evaluation of how well LLMs are able to conduct this reasoning for tasks and how they scale with respect to multiple dimensions (e.g., adaptive prompting strategies, number of in-context examples, varying complexity of the sequential task). Our findings reveal that while Chain of Thought (CoT) prompting can significantly enhance LLMs’ sequential reasoning in certain scenarios, it can also be detrimental in others, whereas Tree of Thoughts (ToT) reasoning is less effective for this type of task. Additionally, we discover that an increase in model size or in-context examples does not consistently lead to improved performance.
Anthology ID:
2024.scalellm-1.3
Volume:
Proceedings of the First edition of the Workshop on the Scaling Behavior of Large Language Models (SCALE-LLM 2024)
Month:
March
Year:
2024
Address:
St. Julian’s, Malta
Editors:
Antonio Valerio Miceli-Barone, Fazl Barez, Shay Cohen, Elena Voita, Ulrich Germann, Michal Lukasik
Venues:
SCALE-LLM | WS
Publisher:
Association for Computational Linguistics
Pages:
24–34
URL:
https://aclanthology.org/2024.scalellm-1.3
Cite (ACL):
Filippos Bellos, Yayuan Li, Wuao Liu, and Jason Corso. 2024. Can Large Language Models Reason About Goal-Oriented Tasks?. In Proceedings of the First edition of the Workshop on the Scaling Behavior of Large Language Models (SCALE-LLM 2024), pages 24–34, St. Julian’s, Malta. Association for Computational Linguistics.
Cite (Informal):
Can Large Language Models Reason About Goal-Oriented Tasks? (Bellos et al., SCALE-LLM-WS 2024)
PDF:
https://aclanthology.org/2024.scalellm-1.3.pdf