Xueyang Feng
2025
Improving Retrospective Language Agents via Joint Policy Gradient Optimization
Xueyang Feng | Bo Lan | Quanyu Dai | Lei Wang | Jiakai Tang | Xu Chen | Zhenhua Dong | Ji-Rong Wen
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
Large language models (LLMs) have sparked great interest in creating autonomous agents. However, current prompt-based agents often rely heavily on large-scale LLMs. Meanwhile, although fine-tuning methods significantly enhance the capabilities of smaller LLMs, the fine-tuned agents often lack the potential for self-reflection and self-improvement. To address these challenges, we introduce RetroAct, a novel agent framework that jointly optimizes both task-planning and self-reflective evolution capabilities in language agents. Specifically, we develop a two-stage joint optimization process that integrates imitation learning and reinforcement learning, and design an off-policy joint policy gradient optimization algorithm with imitation learning regularization to enhance data efficiency and training stability in agent tasks. RetroAct significantly improves the performance of open-source models, reduces dependency on closed-source LLMs, and enables fine-tuned agents to learn and evolve continuously. We conduct extensive experiments across various testing environments, demonstrating that RetroAct yields substantial improvements in task performance and decision-making processes.
2024
Large Language Model-based Human-Agent Collaboration for Complex Task Solving
Xueyang Feng | Zhi-Yuan Chen | Yujia Qin | Yankai Lin | Xu Chen | Zhiyuan Liu | Ji-Rong Wen
Findings of the Association for Computational Linguistics: EMNLP 2024
In recent developments within the research community, the integration of Large Language Models (LLMs) in creating fully autonomous agents has garnered significant interest. Despite this, LLM-based agents frequently demonstrate notable shortcomings in adjusting to dynamic environments and fully grasping human needs. In this work, we introduce the problem of LLM-based human-agent collaboration for complex task-solving, exploring their synergistic potential. To tackle the problem, we propose a Reinforcement Learning-based Human-Agent Collaboration method, ReHAC, which trains a policy model designed to determine the most opportune stages for human intervention within the task-solving process. We conduct experiments under real and simulated human-agent collaboration scenarios. Experimental results demonstrate that the synergistic efforts of humans and LLM-based agents significantly improve performance in complex tasks, primarily through well-planned, limited human intervention. Datasets and code are available at: https://github.com/XueyangFeng/ReHAC/.
Co-authors
- Xu Chen (徐晨) 2
- Ji-Rong Wen 2
- Zhi-Yuan Chen 1
- Quanyu Dai 1
- Zhenhua Dong 1