Junbo Fu


2024

pdf bib
Learning to Paraphrase for Alignment with LLM Preference
Junbo Fu | Guoshuai Zhao | Yimin Deng | Yunqi Mi | Xueming Qian
Findings of the Association for Computational Linguistics: EMNLP 2024

Large Language Models (LLMs) exhibit the issue of paraphrase divergence. This means that when a question is phrased in a slightly different but semantically similar way, LLM may output a wrong response despite being able to answer the original question correctly. Previous research has regarded this issue as a problem of the model’s robustness to question paraphrase and proposed a retraining method to address it. However, retraining faces challenges in meeting the computational costs and privacy security demands of LLMs. In this paper, we regard this issue as a problem of alignment with model preferences and propose PEARL (Preference-drivEn pAraphRase Learning). This is a black-box method that enhances model performance by paraphrasing questions in expressions preferred by the model. We validate PEARL across six datasets spanning three tasks: open-domain QA, commonsense reasoning, and math word problem. Extensive experiments demonstrated not only the outstanding performance but also the composability, transferability, and immense potential of PEARL, shedding new light on the black-box tuning of LLMs.