RaDA: Retrieval-augmented Web Agent Planning with LLMs

Minsoo Kim, Victor Bursztyn, Eunyee Koh, Shunan Guo, Seung-won Hwang


Abstract
Agents powered by large language models (LLMs) inherit important limitations, such as restricted context length, dependency on human-engineered exemplars (e.g., for task decomposition), and insufficient generalization. To address these challenges, we propose RaDA, a novel planning method for Web agents that does not require manual exemplars, efficiently leverages the LLMs’ context, and enhances generalization. RaDA disentangles planning into two stages. First, during Retrieval-augmented Task Decomposition (RaD), it decomposes a new task into high-level subtasks. Next, during Retrieval-augmented Action Generation (RaA), it traverses the trajectory obtained with RaD to iteratively synthesize actions based on dynamically retrieved exemplars. We compare RaDA with strong baselines covering a broad space of design choices, using both GPT-3.5 and GPT-4 as backbones, and we find consistent improvements over the previous state of the art (SOTA) on two challenging benchmarks, CompWoB and Mind2Web, covering settings with different complexities. We show the contributions of RaDA via ablation studies and qualitative analysis, and we discuss the structural benefits of our more compositional design.
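The two-stage flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the word-overlap retriever, the `llm` callable and its `"decompose"`/`"act"` modes, the exemplar memory layout, and the names `retrieve`, `rada_plan`, `fake_llm`, and `MEMORY` are all hypothetical stand-ins chosen only to show how decomposition (RaD) and per-subtask action generation (RaA) might each draw on retrieved exemplars.

```python
from typing import Callable, List, Tuple

# Hypothetical exemplar format: (text used for matching, demonstration text).
Exemplar = Tuple[str, str]
# Hypothetical LLM interface: (mode, input text, exemplars) -> list of strings.
LLM = Callable[[str, str, List[Exemplar]], List[str]]


def retrieve(query: str, memory: List[Exemplar], k: int = 2) -> List[Exemplar]:
    """Toy retriever (assumption): rank exemplars by word overlap with the query."""
    q_words = set(query.lower().split())
    ranked = sorted(
        memory,
        key=lambda ex: len(q_words & set(ex[0].lower().split())),
        reverse=True,
    )
    return ranked[:k]


def rada_plan(task: str, memory: List[Exemplar], llm: LLM, k: int = 2) -> List[str]:
    """Sketch of a two-stage plan: decompose first, then generate actions per subtask."""
    # Stage 1 (RaD): retrieve exemplars for the whole task, then decompose it.
    subtasks = llm("decompose", task, retrieve(task, memory, k))

    # Stage 2 (RaA): walk the subtasks, retrieving fresh exemplars for each one.
    actions: List[str] = []
    for subtask in subtasks:
        actions.extend(llm("act", subtask, retrieve(subtask, memory, k)))
    return actions


def fake_llm(mode: str, text: str, exemplars: List[Exemplar]) -> List[str]:
    """Deterministic stand-in for an LLM call, used only to make the sketch runnable."""
    if mode == "decompose":
        return [part.strip() for part in text.split(" then ")]
    return [f"DO({text})"]


MEMORY: List[Exemplar] = [
    ("click button", "demo of clicking"),
    ("type text", "demo of typing"),
    ("scroll page", "demo of scrolling"),
]

if __name__ == "__main__":
    print(rada_plan("click the button then type hello", MEMORY, fake_llm))
```

In this sketch the exemplars passed to the action-generation step change with every subtask, so each call only carries demonstrations relevant to the current step rather than a fixed set for the whole task.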
Anthology ID:
2024.findings-acl.802
Volume:
Findings of the Association for Computational Linguistics: ACL 2024
Month:
August
Year:
2024
Address:
Bangkok, Thailand
Editors:
Lun-Wei Ku, Andre Martins, Vivek Srikumar
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
13511–13525
URL:
https://aclanthology.org/2024.findings-acl.802
DOI:
10.18653/v1/2024.findings-acl.802
Cite (ACL):
Minsoo Kim, Victor Bursztyn, Eunyee Koh, Shunan Guo, and Seung-won Hwang. 2024. RaDA: Retrieval-augmented Web Agent Planning with LLMs. In Findings of the Association for Computational Linguistics: ACL 2024, pages 13511–13525, Bangkok, Thailand. Association for Computational Linguistics.
Cite (Informal):
RaDA: Retrieval-augmented Web Agent Planning with LLMs (Kim et al., Findings 2024)
PDF:
https://aclanthology.org/2024.findings-acl.802.pdf