LLM Prompt Duel Optimizer: Efficient Label-Free Prompt Optimization

Yuanchen Wu; Saurabh Verma; Justin Lee; Fangzhou Xiong; Poppy Zhang; Amel Awadelkarim; Xu Chen; Yubai Yuan; Shawndra Hill

Abstract

Large language models (LLMs) are highly sensitive to prompts, but most automatic prompt optimization (APO) methods assume access to ground-truth references (e.g., labeled validation data) that are costly to obtain. We propose the Prompt Duel Optimizer (PDO), a sample-efficient framework for label-free prompt optimization based on pairwise preference feedback from an LLM judge. PDO casts prompt selection as a dueling-bandit problem and combines (i) Double Thompson Sampling to prioritize informative comparisons under a fixed judge budget, with (ii) top-performer guided mutation to expand the candidate pool while pruning weak prompts. Experiments on BIG-bench Hard (BBH) and MS MARCO show that PDO consistently identifies stronger prompts than label-free baselines, while offering favorable quality–cost trade-offs under constrained comparison budgets.
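To make the dueling-bandit framing concrete, the following is a minimal sketch of a Thompson-sampling duel selector over a fixed pool of prompts. It is a simplified illustration, not the authors' implementation: it omits the confidence-bound pruning used in full Double Thompson Sampling as well as PDO's mutation and pruning steps, and the LLM judge is replaced by a hypothetical simulated judge driven by made-up `strengths`. The names `select_duel` and `run_duels` are assumptions introduced here.

```python
import random


def select_duel(wins, rng):
    """Pick two distinct prompt indices to compare (simplified D-TS style).

    wins[i][j] counts how often prompt i beat prompt j so far.
    """
    k = len(wins)
    # Sample a plausible win-probability matrix from Beta posteriors.
    theta = [[0.5] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1, k):
            theta[i][j] = rng.betavariate(wins[i][j] + 1, wins[j][i] + 1)
            theta[j][i] = 1.0 - theta[i][j]
    # First prompt: highest Copeland score under the sampled matrix.
    copeland = [
        sum(theta[i][j] > 0.5 for j in range(k) if j != i) for i in range(k)
    ]
    first = max(range(k), key=lambda i: copeland[i])
    # Second prompt: independent resample of "j beats first", take the max.
    second, best = None, -1.0
    for j in range(k):
        if j == first:
            continue
        p = rng.betavariate(wins[j][first] + 1, wins[first][j] + 1)
        if p > best:
            second, best = j, p
    return first, second


def run_duels(strengths, budget, seed=0):
    """Spend `budget` comparisons using a simulated (hypothetical) judge."""
    rng = random.Random(seed)
    k = len(strengths)
    wins = [[0] * k for _ in range(k)]
    for _ in range(budget):
        a, b = select_duel(wins, rng)
        # Stand-in for the LLM judge: a Bradley-Terry coin flip.
        p_a = strengths[a] / (strengths[a] + strengths[b])
        if rng.random() < p_a:
            wins[a][b] += 1
        else:
            wins[b][a] += 1
    return wins


wins = run_duels([1.0, 2.0, 4.0], budget=50)
```

In an actual label-free setting, the coin flip would be replaced by a call to the judge model on the two prompts' outputs, and each call would consume one unit of the comparison budget.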

Anthology ID: 2026.findings-acl.490
Volume: Findings of the Association for Computational Linguistics: ACL 2026
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 10066–10089
URL: https://aclanthology.org/2026.findings-acl.490/
Cite (ACL): Yuanchen Wu, Saurabh Verma, Justin Lee, Fangzhou Xiong, Poppy Zhang, Amel Awadelkarim, Xu Chen, Yubai Yuan, and Shawndra Hill. 2026. LLM Prompt Duel Optimizer: Efficient Label-Free Prompt Optimization. In Findings of the Association for Computational Linguistics: ACL 2026, pages 10066–10089, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): LLM Prompt Duel Optimizer: Efficient Label-Free Prompt Optimization (Wu et al., Findings 2026)
PDF: https://aclanthology.org/2026.findings-acl.490.pdf
Checklist: 2026.findings-acl.490.checklist.pdf