Yuchi Zhang


2023

pdf bib
Conversational Recommender System and Large Language Model Are Made for Each Other in E-commerce Pre-sales Dialogue
Yuanxing Liu | Weinan Zhang | Yifan Chen | Yuchi Zhang | Haopeng Bai | Fan Feng | Hengbin Cui | Yongbin Li | Wanxiang Che
Findings of the Association for Computational Linguistics: EMNLP 2023

E-commerce pre-sales dialogue aims to understand and elicit user needs and preferences for the items they are seeking so as to provide appropriate recommendations. Conversational recommender systems (CRSs) learn user representation and provide accurate recommendations based on dialogue context, but rely on external knowledge. Large language models (LLMs) generate responses that mimic pre-sales dialogues after fine-tuning, but lack domain-specific knowledge for accurate recommendations. Intuitively, the strengths of LLM and CRS in E-commerce pre-sales dialogues are complementary, yet no previous work has explored this. This paper investigates the effectiveness of combining LLM and CRS in E-commerce pre-sales dialogues, proposing two collaboration methods: CRS assisting LLM and LLM assisting CRS. We conduct extensive experiments on a real-world dataset of E-commerce pre-sales dialogues. We analyze the impact of two collaborative approaches with two CRSs and two LLMs on four tasks of E-commerce pre-sales dialogue. We find that collaborations between CRS and LLM can be very effective in some cases.

pdf bib
A Two-Stage Progressive Intent Clustering for Task-Oriented Dialogue
Bingzhu Du | Nan Su | Yuchi Zhang | Yongliang Wang
Proceedings of The Eleventh Dialog System Technology Challenge

Natural Language Understanding (NLU) is one of the most critical components of task-oriented dialogue, and it is often considered as an intent classification task. To achieve outstanding intent identification performance, system designers often need to hire a large number of domain experts to label the data, which is inefficient and costly. To address this problem, researchers’ attention has gradually shifted to automatic intent clustering methods, which employ low-resource unsupervised approaches to solve classification problems. The classical framework for clustering is deep clustering, which uses deep neural networks (DNNs) to jointly optimize non-clustering loss and clustering loss. However, for new conversational domains or services, utterances required to assign intents are scarce and the performance of DNNs is often dependent on large amounts of data. In addition, although re-clustering with k-means algorithm after training the network usually leads to better results, k-means methods often suffer from poor stability. To address these problems, we propose an effective two-stage progressive approach to refine the clustering. Firstly, we pre-train the network with contrastive loss using all conversations data and then optimize the clustering loss and contrastive loss simultaneously. Secondly, we propose adaptive progressive k-means to alleviate the randomness of vanilla k-means, achieving better performance and smaller deviation. Our method ranks second in DSTC11 Track2 Task 1, a benchmark for intent clustering of task-oriented dialogue, demonstrating the superiority and effectiveness of our method.

2020

pdf bib
Selection and Generation: Learning towards Multi-Product Advertisement Post Generation
Zhangming Chan | Yuchi Zhang | Xiuying Chen | Shen Gao | Zhiqiang Zhang | Dongyan Zhao | Rui Yan
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

As the E-commerce thrives, high-quality online advertising copywriting has attracted more and more attention. Different from the advertising copywriting for a single product, an advertisement (AD) post includes an attractive topic that meets the customer needs and description copywriting about several products under its topic. A good AD post can highlight the characteristics of each product, thus helps customers make a good choice among candidate products. Hence, multi-product AD post generation is meaningful and important. We propose a novel end-to-end model named S-MG Net to generate the AD post. Targeted at such a challenging real-world problem, we split the AD post generation task into two subprocesses: (1) select a set of products via the SelectNet (Selection Network). (2) generate a post including selected products via the MGenNet (Multi-Generator Network). Concretely, SelectNet first captures the post topic and the relationship among the products to output the representative products. Then, MGenNet generates the description copywriting of each product. Experiments conducted on a large-scale real-world AD post dataset demonstrate that our proposed model achieves impressive performance in terms of both automatic metrics as well as human evaluations.