E2CL: Exploration-based Error Correction Learning for Embodied Agents

Hanlin Wang, Chak Tou Leong, Jian Wang, Wenjie Li


Abstract
Language models are exhibiting increasing capability in knowledge utilization and reasoning. However, when applied as agents in embodied environments, they often suffer from misalignment between their intrinsic knowledge and environmental knowledge, leading to infeasible actions. Traditional environment alignment methods, such as supervised learning on expert trajectories and reinforcement learning, encounter limitations in covering environmental knowledge and achieving efficient convergence, respectively. Inspired by human learning, we propose Exploration-based Error Correction Learning (E2CL), a novel framework that leverages exploration-induced errors and environmental feedback to enhance environment alignment for embodied agents. E2CL incorporates teacher-guided and teacher-free explorations to gather environmental feedback and correct erroneous actions. The agent learns to provide feedback and self-correct, thereby enhancing its adaptability to target environments. Extensive experiments in the VirtualHome environment demonstrate that E2CL-trained agents outperform those trained by baseline methods and exhibit superior self-correction capabilities.
Anthology ID:
2024.findings-emnlp.448
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2024
Month:
November
Year:
2024
Address:
Miami, Florida, USA
Editors:
Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
7626–7639
URL:
https://aclanthology.org/2024.findings-emnlp.448
Cite (ACL):
Hanlin Wang, Chak Tou Leong, Jian Wang, and Wenjie Li. 2024. E2CL: Exploration-based Error Correction Learning for Embodied Agents. In Findings of the Association for Computational Linguistics: EMNLP 2024, pages 7626–7639, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal):
E2CL: Exploration-based Error Correction Learning for Embodied Agents (Wang et al., Findings 2024)
PDF:
https://aclanthology.org/2024.findings-emnlp.448.pdf