Guoshuai Zhao
2024
Learning to Paraphrase for Alignment with LLM Preference
Junbo Fu
|
Guoshuai Zhao
|
Yimin Deng
|
Yunqi Mi
|
Xueming Qian
Findings of the Association for Computational Linguistics: EMNLP 2024
Large Language Models (LLMs) exhibit the issue of paraphrase divergence. This means that when a question is phrased in a slightly different but semantically similar way, LLM may output a wrong response despite being able to answer the original question correctly. Previous research has regarded this issue as a problem of the model’s robustness to question paraphrase and proposed a retraining method to address it. However, retraining faces challenges in meeting the computational costs and privacy security demands of LLMs. In this paper, we regard this issue as a problem of alignment with model preferences and propose PEARL (Preference-drivEn pAraphRase Learning). This is a black-box method that enhances model performance by paraphrasing questions in expressions preferred by the model. We validate PEARL across six datasets spanning three tasks: open-domain QA, commonsense reasoning, and math word problem. Extensive experiments demonstrated not only the outstanding performance but also the composability, transferability, and immense potential of PEARL, shedding new light on the black-box tuning of LLMs.
Pseudo-Label Enhanced Prototypical Contrastive Learning for Uniformed Intent Discovery
Yimin Deng
|
Yuxia Wu
|
Guoshuai Zhao
|
Li Zhu
|
Xueming Qian
Findings of the Association for Computational Linguistics: EMNLP 2024
New intent discovery is a crucial capability for task-oriented dialogue systems. Existing methods focus on transferring in-domain (IND) prior knowledge to out-of-domain (OOD) data through pre-training and clustering stages. They either handle the two processes in a pipeline manner, which exhibits a gap between intent representation and clustering process or use typical contrastive clustering that overlooks the potential supervised signals from the whole data. Besides, they often deal with either open intent discovery or OOD settings individually. To this end, we propose a Pseudo-Label enhanced Prototypical Contrastive Learning (PLPCL) model for uniformed intent discovery. We iteratively utilize pseudo-labels to explore potential positive/negative samples for contrastive learning and bridge the gap between representation and clustering. To enable better knowledge transfer, we design a prototype learning method integrating the supervised and pseudo signals from IND and OOD samples. In addition, our method has been proven effective in two different settings of discovering new intents. Experiments on three benchmark datasets and two task settings demonstrate the effectiveness of our approach.
Search
Co-authors
- Yimin Deng 2
- Xueming Qian 2
- Junbo Fu 1
- Yunqi Mi 1
- Yuxia Wu 1
- show all...
- Li Zhu 1