Yunzhe Qi


2025

Learning to Instruct: Fine-Tuning a Task-Aware Instruction Optimizer for Black-Box LLMs
Yunzhe Qi | Jinjin Tian | Tianci Liu | Ruirui Li | Tianxin Wei | Hui Liu | Xianfeng Tang | Monica Xiao Cheng | Jingrui He
Findings of the Association for Computational Linguistics: EMNLP 2025

The performance of Large Language Models (LLMs) critically depends on designing effective instructions, which is particularly challenging for black-box LLMs with inaccessible internal states. To address this, we introduce Learning to Instruct, a novel paradigm that formulates instruction optimization as an LLM fine-tuning objective for a white-box “instruction engineer” LLM, leveraging its rich learning capacity and vast pre-trained knowledge to enable efficient and effective instruction optimization. Within this paradigm, we propose Automatic Instruction Optimizer (AIO), a framework that fine-tunes a white-box LLM into a capable instruction engineer. AIO learns to optimize task-aware, human-comprehensible instructions by incorporating task nuances and feedback from the task-solving black-box LLM. To overcome the challenges of inaccessible black-box gradients and high API costs, AIO introduces a Thompson Sampling (TS)-guided zeroth-order (ZO) gradient approximation mechanism that reuses informative black-box LLM feedback to improve query efficiency. Extensive experiments show that AIO generally outperforms strong baselines in both effectiveness and efficiency, establishing Learning to Instruct as a promising new direction for black-box LLM instruction optimization.
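
To make the core idea concrete, below is a minimal illustrative sketch of the two ingredients the abstract names: a two-point (SPSA-style) zeroth-order gradient estimate over a black-box score, with Thompson Sampling deciding which cached perturbation direction to reuse before spending fresh queries. This is not the paper's AIO implementation; all names (black_box_score, SOFT_DIM, the toy objective, the arm-refresh rule) are hypothetical stand-ins chosen so the sketch runs end to end.

# Illustrative sketch only, not the AIO method from the paper.
import numpy as np

rng = np.random.default_rng(0)
SOFT_DIM = 16        # dimensionality of a hypothetical "soft instruction" vector
MU = 1e-2            # perturbation scale for the two-point ZO estimate
LR = 0.1             # step size for the gradient-ascent update
N_ARMS = 4           # number of cached perturbation directions (TS arms)

def black_box_score(x: np.ndarray) -> float:
    """Stand-in for the task-solving black-box LLM's feedback signal.
    Here: a toy concave objective so the sketch is runnable."""
    target = np.linspace(-1.0, 1.0, SOFT_DIM)
    return float(-np.sum((x - target) ** 2))

# Thompson Sampling state: one Beta(alpha, beta) posterior per cached
# direction, tracking how often reusing that direction improved the score.
alpha = np.ones(N_ARMS)
beta = np.ones(N_ARMS)
directions = [rng.standard_normal(SOFT_DIM) for _ in range(N_ARMS)]

x = np.zeros(SOFT_DIM)            # current soft instruction
score = black_box_score(x)

for step in range(200):
    # 1) Thompson Sampling: draw one sample per arm posterior, pick the best.
    arm = int(np.argmax(rng.beta(alpha, beta)))
    u = directions[arm]
    u = u / np.linalg.norm(u)

    # 2) Two-point ZO gradient estimate along the sampled direction:
    #    g ~ (f(x + mu*u) - f(x - mu*u)) / (2*mu) * u,
    #    costing two black-box queries instead of one per coordinate.
    f_plus = black_box_score(x + MU * u)
    f_minus = black_box_score(x - MU * u)
    g = (f_plus - f_minus) / (2.0 * MU) * u

    # 3) Ascent step on the estimated gradient.
    x = x + LR * g
    new_score = black_box_score(x)

    # 4) Posterior update: reward the arm if reusing it improved the score;
    #    otherwise penalize it and refresh that cached direction.
    if new_score > score:
        alpha[arm] += 1.0
    else:
        beta[arm] += 1.0
        directions[arm] = rng.standard_normal(SOFT_DIM)
    score = new_score

print(f"final score after 200 steps: {score:.4f}")

Under these assumptions, the bandit layer spends the two-query budget on perturbation directions that have historically paid off, which is one plausible reading of how reused black-box feedback could improve query efficiency in a ZO scheme.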