Haopeng Bai
2023
Conversational Recommender System and Large Language Model Are Made for Each Other in E-commerce Pre-sales Dialogue
Yuanxing Liu | Weinan Zhang | Yifan Chen | Yuchi Zhang | Haopeng Bai | Fan Feng | Hengbin Cui | Yongbin Li | Wanxiang Che
Findings of the Association for Computational Linguistics: EMNLP 2023
Yuanxing Liu | Weinan Zhang | Yifan Chen | Yuchi Zhang | Haopeng Bai | Fan Feng | Hengbin Cui | Yongbin Li | Wanxiang Che
Findings of the Association for Computational Linguistics: EMNLP 2023
E-commerce pre-sales dialogue aims to understand and elicit user needs and preferences for the items they are seeking so as to provide appropriate recommendations. Conversational recommender systems (CRSs) learn user representation and provide accurate recommendations based on dialogue context, but rely on external knowledge. Large language models (LLMs) generate responses that mimic pre-sales dialogues after fine-tuning, but lack domain-specific knowledge for accurate recommendations. Intuitively, the strengths of LLM and CRS in E-commerce pre-sales dialogues are complementary, yet no previous work has explored this. This paper investigates the effectiveness of combining LLM and CRS in E-commerce pre-sales dialogues, proposing two collaboration methods: CRS assisting LLM and LLM assisting CRS. We conduct extensive experiments on a real-world dataset of E-commerce pre-sales dialogues. We analyze the impact of two collaborative approaches with two CRSs and two LLMs on four tasks of E-commerce pre-sales dialogue. We find that collaborations between CRS and LLM can be very effective in some cases.
Through the Lens of Core Competency: Survey on Evaluation of Large Language Models
Ziyu Zhuang | Qiguang Chen | Longxuan Ma | Mingda Li | Yi Han | Yushan Qian | Haopeng Bai | Weinan Zhang | Ting Liu
Proceedings of the 22nd Chinese National Conference on Computational Linguistics (Volume 2: Frontier Forum)
Ziyu Zhuang | Qiguang Chen | Longxuan Ma | Mingda Li | Yi Han | Yushan Qian | Haopeng Bai | Weinan Zhang | Ting Liu
Proceedings of the 22nd Chinese National Conference on Computational Linguistics (Volume 2: Frontier Forum)
“From pre-trained language model (PLM) to large language model (LLM), the field of naturallanguage processing (NLP) has witnessed steep performance gains and wide practical uses. Theevaluation of a research field guides its direction of improvement. However, LLMs are extremelyhard to thoroughly evaluate for two reasons. First of all, traditional NLP tasks become inade-quate due to the excellent performance of LLM. Secondly, existing evaluation tasks are difficultto keep up with the wide range of applications in real-world scenarios. To tackle these problems,existing works proposed various benchmarks to better evaluate LLMs. To clarify the numerousevaluation tasks in both academia and industry, we investigate multiple papers concerning LLMevaluations. We summarize 4 core competencies of LLM, including reasoning, knowledge, relia-bility, and safety. For every competency, we introduce its definition, corresponding benchmarks,and metrics. Under this competency architecture, similar tasks are combined to reflect corre-sponding ability, while new tasks can also be easily added into the system. Finally, we give oursuggestions on the future direction of LLM’s evaluation.”