@inproceedings{li-etal-2025-dora,
title = "{DORA}: Dynamic Optimization Prompt for Continuous Reflection of {LLM}-based Agent",
author = "Li, Kun and
Zhao, Tingzhang and
Zhou, Wei and
Hu, Songlin",
editor = "Rambow, Owen and
Wanner, Leo and
Apidianaki, Marianna and
Al-Khalifa, Hend and
Di Eugenio, Barbara and
Schockaert, Steven",
booktitle = "Proceedings of the 31st International Conference on Computational Linguistics",
month = jan,
year = "2025",
address = "Abu Dhabi, UAE",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2025.coling-main.504/",
pages = "7546--7557",
abstract = "Autonomous agents powered by large language models (LLMs) hold significant potential across various domains. The Reflection framework is designed to help agents learn from past mistakes in complex tasks. While previous research has shown that reflection can enhance performance, our investigation reveals a key limitation: meaningful self-reflection primarily occurs at the beginning of iterations, with subsequent attempts failing to produce further improvements. We term this phenomenon {\textquotedblleft}Early Stop Reflection,{\textquotedblright} where the reflection process halts prematurely, limiting the agent`s ability to engage in continuous learning. To address this, we propose the DORA method (Dynamic and Optimized Reflection Advice), which generates task-adaptive and diverse reflection advice. DORA introduces an external open-source small language model (SLM) that dynamically generates prompts for the reflection LLM. The SLM uses feedback from the agent and optimizes the prompt generation process through a non-gradient Bayesian Optimization (BO) algorithm, ensuring the reflection process evolves and adapts over time. Our experiments in the MiniWoB++ and Alfworld environments confirm that DORA effectively mitigates the {\textquotedblleft}Early Stop Reflection{\textquotedblright} issue, enabling agents to maintain iterative improvements and boost performance in long-term, complex tasks. Code are available at https://anonymous.4open.science/r/DORA-44FB/."
}
<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="li-etal-2025-dora">
<titleInfo>
<title>DORA: Dynamic Optimization Prompt for Continuous Reflection of LLM-based Agent</title>
</titleInfo>
<name type="personal">
<namePart type="given">Kun</namePart>
<namePart type="family">Li</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Tingzhang</namePart>
<namePart type="family">Zhao</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Wei</namePart>
<namePart type="family">Zhou</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Songlin</namePart>
<namePart type="family">Hu</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<originInfo>
<dateIssued>2025-01</dateIssued>
</originInfo>
<typeOfResource>text</typeOfResource>
<relatedItem type="host">
<titleInfo>
<title>Proceedings of the 31st International Conference on Computational Linguistics</title>
</titleInfo>
<name type="personal">
<namePart type="given">Owen</namePart>
<namePart type="family">Rambow</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Leo</namePart>
<namePart type="family">Wanner</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Marianna</namePart>
<namePart type="family">Apidianaki</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Hend</namePart>
<namePart type="family">Al-Khalifa</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Barbara</namePart>
<namePart type="given">Di</namePart>
<namePart type="family">Eugenio</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Steven</namePart>
<namePart type="family">Schockaert</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<originInfo>
<publisher>Association for Computational Linguistics</publisher>
<place>
<placeTerm type="text">Abu Dhabi, UAE</placeTerm>
</place>
</originInfo>
<genre authority="marcgt">conference publication</genre>
</relatedItem>
<abstract>Autonomous agents powered by large language models (LLMs) hold significant potential across various domains. The Reflection framework is designed to help agents learn from past mistakes in complex tasks. While previous research has shown that reflection can enhance performance, our investigation reveals a key limitation: meaningful self-reflection primarily occurs at the beginning of iterations, with subsequent attempts failing to produce further improvements. We term this phenomenon “Early Stop Reflection,” where the reflection process halts prematurely, limiting the agent‘s ability to engage in continuous learning. To address this, we propose the DORA method (Dynamic and Optimized Reflection Advice), which generates task-adaptive and diverse reflection advice. DORA introduces an external open-source small language model (SLM) that dynamically generates prompts for the reflection LLM. The SLM uses feedback from the agent and optimizes the prompt generation process through a non-gradient Bayesian Optimization (BO) algorithm, ensuring the reflection process evolves and adapts over time. Our experiments in the MiniWoB++ and Alfworld environments confirm that DORA effectively mitigates the “Early Stop Reflection” issue, enabling agents to maintain iterative improvements and boost performance in long-term, complex tasks. Code are available at https://anonymous.4open.science/r/DORA-44FB/.</abstract>
<identifier type="citekey">li-etal-2025-dora</identifier>
<location>
<url>https://aclanthology.org/2025.coling-main.504/</url>
</location>
<part>
<date>2025-01</date>
<extent unit="page">
<start>7546</start>
<end>7557</end>
</extent>
</part>
</mods>
</modsCollection>
%0 Conference Proceedings
%T DORA: Dynamic Optimization Prompt for Continuous Reflection of LLM-based Agent
%A Li, Kun
%A Zhao, Tingzhang
%A Zhou, Wei
%A Hu, Songlin
%Y Rambow, Owen
%Y Wanner, Leo
%Y Apidianaki, Marianna
%Y Al-Khalifa, Hend
%Y Di Eugenio, Barbara
%Y Schockaert, Steven
%S Proceedings of the 31st International Conference on Computational Linguistics
%D 2025
%8 January
%I Association for Computational Linguistics
%C Abu Dhabi, UAE
%F li-etal-2025-dora
%X Autonomous agents powered by large language models (LLMs) hold significant potential across various domains. The Reflection framework is designed to help agents learn from past mistakes in complex tasks. While previous research has shown that reflection can enhance performance, our investigation reveals a key limitation: meaningful self-reflection primarily occurs at the beginning of iterations, with subsequent attempts failing to produce further improvements. We term this phenomenon “Early Stop Reflection,” where the reflection process halts prematurely, limiting the agent’s ability to engage in continuous learning. To address this, we propose the DORA method (Dynamic and Optimized Reflection Advice), which generates task-adaptive and diverse reflection advice. DORA introduces an external open-source small language model (SLM) that dynamically generates prompts for the reflection LLM. The SLM uses feedback from the agent and optimizes the prompt generation process through a non-gradient Bayesian Optimization (BO) algorithm, ensuring the reflection process evolves and adapts over time. Our experiments in the MiniWoB++ and ALFWorld environments confirm that DORA effectively mitigates the “Early Stop Reflection” issue, enabling agents to maintain iterative improvements and boost performance in long-term, complex tasks. Code is available at https://anonymous.4open.science/r/DORA-44FB/.
%U https://aclanthology.org/2025.coling-main.504/
%P 7546-7557
Markdown (Informal)
[DORA: Dynamic Optimization Prompt for Continuous Reflection of LLM-based Agent](https://aclanthology.org/2025.coling-main.504/) (Li et al., COLING 2025)
ACL
Kun Li, Tingzhang Zhao, Wei Zhou, and Songlin Hu. 2025. DORA: Dynamic Optimization Prompt for Continuous Reflection of LLM-based Agent. In Proceedings of the 31st International Conference on Computational Linguistics, pages 7546–7557, Abu Dhabi, UAE. Association for Computational Linguistics.
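
The abstract above outlines DORA's reflection loop: a small language model proposes reflection prompts, the agent's task feedback scores them, and a non-gradient Bayesian Optimization step selects the next prompt. The sketch below illustrates that loop under stated assumptions: `slm_generate_advice`, `run_agent_episode`, and the two-dimensional prompt-control space are hypothetical stand-ins, and `skopt.gp_minimize` serves as a generic non-gradient BO routine, not the paper's actual implementation.

```python
"""Minimal sketch of the reflection loop described in the DORA abstract.

All names below are illustrative assumptions, not the paper's code:
the SLM call, the agent rollout, and the two prompt controls are stubs.
"""
from skopt import gp_minimize  # non-gradient Bayesian Optimization
from skopt.space import Real


def slm_generate_advice(temperature: float, diversity: float) -> str:
    """Hypothetical SLM call: decode reflection advice under two controls."""
    return (f"Reflect on the last failure and propose a concrete fix "
            f"(temp={temperature:.2f}, diversity={diversity:.2f}).")


def run_agent_episode(reflection_prompt: str) -> float:
    """Hypothetical rollout: run the LLM agent with this reflection prompt
    and return a task success score in [0, 1]. Stubbed out here."""
    return 0.0  # replace with a MiniWoB++ / ALFWorld episode


def objective(params) -> float:
    """One reflection round; score is negated because gp_minimize minimizes."""
    temperature, diversity = params
    advice = slm_generate_advice(temperature, diversity)
    return -run_agent_episode(advice)


# Each BO call proposes fresh prompt controls from past agent feedback,
# so the reflection advice keeps changing rather than converging early.
result = gp_minimize(
    objective,
    dimensions=[Real(0.1, 1.5, name="temperature"),
                Real(0.0, 1.0, name="diversity")],
    n_calls=15,
    random_state=0,
)
print("best prompt controls:", result.x, "best score:", -result.fun)
```

Plugging a real SLM and real environment rollouts into the two stubs would let the optimizer keep diversifying reflection advice across iterations, which is the mechanism the abstract credits with mitigating Early Stop Reflection.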