The advent of large language models (LLMs) like GPT-4 has catalyzed the exploration of multi-task learning (MTL), in which a single model demonstrates proficiency across diverse tasks. Task arithmetic has emerged as a cost-effective approach for MTL. It enables performance enhancement across multiple tasks by adding their corresponding task vectors to a pre-trained model. However, the current lack of a method that can simultaneously achieve optimal performance, computational efficiency, and data privacy limits their application to LLMs. In this paper, we propose Model Exclusive Task Arithmetic for merging GPT-scale models (MetaGPT) which formalizes the objective of model merging into a multi-task learning framework, aiming to minimize the average loss difference between the merged model and each individual task model. Since data privacy limits the use of multi-task training data, we leverage LLMs’ local linearity and task vectors’ orthogonality to separate the data term and scaling coefficients term and derive a model-exclusive task arithmetic method. Our proposed MetaGPT is data-agnostic and bypasses the heavy search process, making it cost-effective and easy to implement for LLMs. Extensive experiments demonstrate that MetaGPT leads to improvement of task arithmetic and achieves state-of-the-art performance on multiple tasks.
Ongoing chatting is an important step for conversational agents to build long-term connections with people. However, people tend to quickly lose interest in chatting if the conversational agent’s words are not engaging enough. In this paper, we present a novel task of increasing users’ willingness to continue talking to the agent.We collect a dataset named ContinuousChat by: (i) collecting personas and revising them, and then expanding the personas to detailed-personas through experiences, daily life, future plans, or interesting stories; (ii) expanding detailed-personas into the dialogues, and inject emotions and feelings into them; (iii) rewriting the dialogues in specific styles through few-shot prompt, conditioning on handwritten style-specific examples.We benchmark LLMs on ContinuousChat Dataset using both fine-tuning and in-context learning settings. Experiments over publicly available models demonstrate that although there is substantial room for improvement in generating style-specific dialogues, our ContinuousChat dataset is valuable in guiding conversational agents to generate more attractive dialogues and increase users’ willingness to continue the conversations.
Synthetic data has been proposed as a solution to address the issue of high-quality data scarcity in the training of large language models (LLMs). Studies have shown that synthetic data can effectively improve the performance of LLMs on downstream benchmarks. However, despite its potential benefits, our analysis suggests that there may be inherent flaws in synthetic data. The uniform format of synthetic data can lead to pattern overfitting and cause significant shifts in the output distribution, thereby reducing the model’s instruction-following capabilities. Our work delves into these specific flaws associated with question-answer (Q-A) pairs, a prevalent type of synthetic data, and presents a method based on unlearning techniques to mitigate these flaws. The empirical results demonstrate the effectiveness of our approach, which can reverse the instruction-following issues caused by pattern overfitting without compromising performance on benchmarks at relatively low cost. Our work has yielded key insights into the effective use of synthetic data, aiming to promote more robust and efficient LLM training.