Yuchong Sun


2024

pdf bib
BSharedRAG: Backbone Shared Retrieval-Augmented Generation for the E-commerce Domain
Kaisi Guan | Qian Cao | Yuchong Sun | Xiting Wang | Ruihua Song
Findings of the Association for Computational Linguistics: EMNLP 2024

Retrieval Augmented Generation (RAG) system is important in domains such as e-commerce, which has many long-tail entities and frequently updated information. Most existing works adopt separate modules for retrieval and generation, which may be suboptimal since the retrieval task and the generation task cannot benefit from each other to improve performance. We propose a novel Backbone Shared RAG framework (BSharedRAG). It first uses a domain-specific corpus to continually pre-train a base model as a domain-specific backbone model and then trains two plug-and-play Low-Rank Adaptation (LoRA) modules based on the shared backbone to minimize retrieval and generation losses respectively. Experimental results indicate that our proposed BSharedRAG outperforms baseline models by 5% and 13% in Hit@3 upon two datasets in retrieval evaluation and by 23% in terms of BLEU-3 in generation evaluation. Our codes, models, and dataset are available at https://bsharedrag.github.io.

pdf bib
Parrot: Enhancing Multi-Turn Instruction Following for Large Language Models
Yuchong Sun | Che Liu | Kun Zhou | Jinwen Huang | Ruihua Song | Xin Zhao | Fuzheng Zhang | Di Zhang | Kun Gai
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Humans often interact with large language models (LLMs) in multi-turn interaction to obtain desired answers or more information. However, most existing studies overlook the multi-turn instruction following ability of LLMs, in terms of training dataset, training method, and evaluation benchmark. In this paper, we introduce Parrot, a solution aiming to enhance multi-turn instruction following for LLMs. First, we introduce an efficient but effective method for collecting multi-turn instructions that feature human-like queries, such as anaphora and ellipsis. Second, we propose a context-aware preference optimization strategy to further enhance LLMs for complex queries in multi-turn interaction. Moreover, to quantitatively evaluate LLMs in multi-turn instruction following, we manually build a multi-turn benchmark derived from existing ones. Extensive experiments show that Parrot improves current LLMs by up to 7.2% in multi-turn instruction following. Our dataset and codes will be open-sourced to facilitate future research.

2023

pdf bib
Joint Semantic and Strategy Matching for Persuasive Dialogue
Chuhao Jin | Yutao Zhu | Lingzhen Kong | Shijie Li | Xiao Zhang | Ruihua Song | Xu Chen | Huan Chen | Yuchong Sun | Yu Chen | Jun Xu
Findings of the Association for Computational Linguistics: EMNLP 2023

Persuasive dialogue aims to persuade users to achieve some targets by conversations. While previous persuasion models have achieved notable successes, they mostly base themselves on utterance semantic matching, and an important aspect has been ignored, that is, the strategy of the conversations, for example, the agent can choose an emotional-appeal strategy to impress users. Compared with utterance semantics, conversation strategies are high-level concepts, which can be informative and provide complementary information to achieve effective persuasions. In this paper, we propose to build a persuasion model by jointly modeling the conversation semantics and strategies, where we design a BERT-like module and an auto-regressive predictor to match the semantics and strategies, respectively. Experimental results indicate that our proposed approach can significantly improve the state-of-the-art baseline by 5% on a small dataset and 37% on a large dataset in terms of Recall@1. Detailed analyses show that the auto-regressive predictor contributes most to the final performance.