2024
EasyEdit: An Easy-to-use Knowledge Editing Framework for Large Language Models
Peng Wang | Ningyu Zhang | Bozhong Tian | Zekun Xi | Yunzhi Yao | Ziwen Xu | Mengru Wang | Shengyu Mao | Xiaohan Wang | Siyuan Cheng | Kangwei Liu | Yuansheng Ni | Guozhou Zheng | Huajun Chen
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)
Large Language Models (LLMs) often suffer from knowledge cutoff or fallacy issues: they are unaware of events that occurred after training, or they generate text with incorrect facts owing to outdated or noisy data. Many knowledge editing approaches for LLMs have therefore emerged, aiming to subtly inject or edit updated knowledge or adjust undesired behavior while minimizing the impact on unrelated inputs. Nevertheless, because knowledge editing methods differ significantly and task setups vary, no standard implementation framework is available to the community, which hinders practitioners from applying knowledge editing in real applications. To address these issues, we propose EasyEdit, an easy-to-use knowledge editing framework for LLMs. It supports various cutting-edge knowledge editing approaches and can be readily applied to many well-known LLMs such as T5, GPT-J, and LLaMA. Empirically, we report knowledge editing results on LLaMA-2 with EasyEdit, demonstrating that knowledge editing surpasses traditional fine-tuning in terms of reliability and generalization. We have released the source code on GitHub, along with Google Colab tutorials and comprehensive documentation for beginners. We also present an online system for real-time knowledge editing, and a demo video.
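The typical EasyEdit workflow (load method-specific hyperparameters, build an editor, apply an edit, inspect the metrics) can be sketched as follows. This is a minimal sketch modeled on the project's README; the hyperparameter file path, the example fact, and the exact keyword arguments are assumptions that may differ across EasyEdit versions.

```python
from easyeditor import BaseEditor, ROMEHyperParams

# Edit a single fact with ROME. The hparams path follows the README layout
# and may need adjusting for your checkout and target model.
hparams = ROMEHyperParams.from_hparams('./hparams/ROME/llama-7b')
editor = BaseEditor.from_hparams(hparams)

metrics, edited_model, _ = editor.edit(
    prompts=['Who was the designer of the Lamborghini Countach?'],
    ground_truth=['Marcello Gandini'],   # what the model currently answers
    target_new=['Walter de Silva'],      # counterfactual target for the edit
    subject=['Lamborghini Countach'],    # entity whose representation ROME locates
)
print(metrics)  # reliability / generalization / locality scores for this edit
```

The same editor interface is intended to wrap other editing methods (e.g., MEND, MEMIT) by swapping the hyperparameter class.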
Effective Demonstration Annotation for In-Context Learning via Language Model-Based Determinantal Point Process
Peng Wang | Xiaobin Wang | Chao Lou | Shengyu Mao | Pengjun Xie | Yong Jiang
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
In-context learning (ICL) is a few-shot learning paradigm that involves learning mappings from input-output pairs and applying them appropriately to new instances. Despite the remarkable ICL capabilities demonstrated by Large Language Models (LLMs), existing works depend heavily on large-scale labeled support sets, which are not always feasible in practical scenarios. To refine this approach, we focus on a selective annotation mechanism that precedes the standard demonstration retrieval step. We introduce the Language Model-based Determinantal Point Process (LM-DPP), which simultaneously considers the uncertainty and diversity of unlabeled instances for optimal selection, yielding a subset for annotation that strikes a trade-off between the two factors. We apply LM-DPP to various language models, including GPT-J, LLaMA, and GPT-3. Experimental results on 9 NLU and 2 generation datasets demonstrate that LM-DPP can effectively select canonical examples. Further analysis reveals that LLMs benefit most from subsets with both low uncertainty and high diversity.
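The selection step can be illustrated with a standard greedy MAP approximation for a DPP whose kernel combines a quality term (derived from the LM's uncertainty) with an embedding-similarity term. This is a generic sketch of the technique, not the paper's implementation; the exponential weighting of uncertainty and the cosine-similarity kernel are assumptions.

```python
import numpy as np

def lm_dpp_select(embeddings, uncertainty, k):
    """Greedily pick k instances to annotate from a DPP kernel that
    trades off quality (low LM uncertainty) against diversity.

    embeddings  : (n, d) array of candidate representations
    uncertainty : (n,) array of LM-derived uncertainty scores (higher = more uncertain)
    k           : number of instances to select
    """
    # Quality term: favor low-uncertainty (canonical) examples.
    quality = np.exp(-uncertainty)

    # Diversity term: cosine similarity between candidates.
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = normed @ normed.T

    # DPP kernel L = diag(q) . S . diag(q)
    L = quality[:, None] * sim * quality[None, :]

    selected, remaining = [], list(range(len(L)))
    for _ in range(k):
        best_j, best_logdet = None, -np.inf
        for j in remaining:
            idx = selected + [j]
            sign, logdet = np.linalg.slogdet(L[np.ix_(idx, idx)])
            score = logdet if sign > 0 else -np.inf
            if score > best_logdet:
                best_j, best_logdet = j, score
        selected.append(best_j)
        remaining.remove(best_j)
    return selected
```

The selected indices then form the small pool that gets human labels and serves as the retrieval corpus for ICL demonstrations.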
RaFe: Ranking Feedback Improves Query Rewriting for RAG
Shengyu Mao | Yong Jiang | Boli Chen | Xiao Li | Peng Wang | Xinyu Wang | Pengjun Xie | Fei Huang | Huajun Chen | Ningyu Zhang
Findings of the Association for Computational Linguistics: EMNLP 2024
As Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG) techniques have evolved, query rewriting has been widely incorporated into RAG systems for downstream tasks such as open-domain QA, enhancing document retrieval by reformulating queries. Many works have attempted to train smaller query rewriting models to avoid rewriting with costly LLMs, and the most common approach is reinforcement learning from feedback. However, current methods require annotations (labeled relevant documents or downstream answers) or predesigned rewards for feedback, lack generalization, and fail to exploit signals tailored to query rewriting. In this paper, we propose RaFe, a framework for training query rewriting models. By leveraging a reranker, RaFe provides ranking feedback that aligns well with the rewriting objective, requires no annotation signals, and supports both online and offline training. Experimental results demonstrate that, with a general and publicly available reranker, RaFe can effectively steer the training of rewriting models.
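The core idea, turning an off-the-shelf reranker's scores over documents retrieved with a rewritten query into a label-free training signal, can be sketched as follows. The retriever interface, the pairing of the original query with each document, and the score averaging are illustrative assumptions, not the paper's exact reward design.

```python
from sentence_transformers import CrossEncoder

# Any publicly available cross-encoder reranker can serve as the feedback model.
reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

def ranking_feedback(original_query, rewritten_query, retrieve, top_k=5):
    """Label-free feedback for a rewrite: retrieve with the rewritten query,
    then let the reranker judge how relevant the results are to the original query."""
    docs = retrieve(rewritten_query, top_k)   # `retrieve` is a hypothetical BM25/dense retriever
    scores = reranker.predict([(original_query, d) for d in docs])
    return float(scores.mean())               # higher = better rewrite; usable as an RL reward
```

Such a scalar can then drive offline (preference-style) or online reinforcement learning of the rewriting model.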
Knowledge Mechanisms in Large Language Models: A Survey and Perspective
Mengru Wang | Yunzhi Yao | Ziwen Xu | Shuofei Qiao | Shumin Deng | Peng Wang | Xiang Chen | Jia-Chen Gu | Yong Jiang | Pengjun Xie | Fei Huang | Huajun Chen | Ningyu Zhang
Findings of the Association for Computational Linguistics: EMNLP 2024
Understanding knowledge mechanisms in Large Language Models (LLMs) is crucial for advancing towards trustworthy AGI. This paper reviews knowledge mechanism analysis under a novel taxonomy covering knowledge utilization and knowledge evolution. Knowledge utilization delves into the mechanisms of memorization, comprehension and application, and creation. Knowledge evolution focuses on the dynamic progression of knowledge within individual and group LLMs. Moreover, we discuss what knowledge LLMs have learned, the reasons for the fragility of parametric knowledge, and the potential dark knowledge (hypothesis) that will be challenging to address. We hope this work can help the community understand knowledge in LLMs and provide insights for future research.
2023
Knowledge Rumination for Pre-trained Language Models
Yunzhi Yao | Peng Wang | Shengyu Mao | Chuanqi Tan | Fei Huang | Huajun Chen | Ningyu Zhang
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
Previous studies have revealed that vanilla pre-trained language models (PLMs) lack the capacity to handle knowledge-intensive NLP tasks alone; thus, several works have attempted to integrate external knowledge into PLMs. However, despite the promising results, we empirically observe that PLMs may already encode rich knowledge in their pre-trained parameters but fail to fully utilize it when applied to knowledge-intensive tasks. In this paper, we propose a new paradigm, dubbed Knowledge Rumination, to help a pre-trained language model utilize related latent knowledge without retrieving it from an external corpus. By simply adding a prompt such as “As far as I know” to the PLM, the model reviews related latent knowledge, which is then injected back into the model for knowledge consolidation. We apply knowledge rumination to various language models, including RoBERTa, DeBERTa, and GPT-3. Experimental results on six commonsense reasoning tasks and the GLUE benchmark demonstrate the effectiveness of the proposed approach, showing that the knowledge stored in PLMs can be better exploited to enhance performance.
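For the generative setting, the two-stage idea (first let the model "ruminate" on what it already knows, then consolidate that knowledge when answering) can be sketched with plain prompting. This is a minimal sketch assuming a Hugging Face causal LM; for RoBERTa/DeBERTa the paper consolidates the ruminated knowledge inside the model rather than by concatenating text, so this only illustrates the prompting-style variant.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical model choice for illustration; the paper uses RoBERTa, DeBERTa, and GPT-3.
model_name = "gpt2"
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

question = "Can a sunflower grow without sunlight?"

# Step 1: rumination -- prompt the model to surface its own latent knowledge.
rumination_prompt = f"Question: {question}\nAs far as I know,"
ids = tok(rumination_prompt, return_tensors="pt").input_ids
knowledge_ids = model.generate(ids, max_new_tokens=40, do_sample=False)
knowledge = tok.decode(knowledge_ids[0, ids.shape[1]:], skip_special_tokens=True)

# Step 2: consolidation -- condition the final answer on the ruminated knowledge.
answer_prompt = f"{knowledge}\nQuestion: {question}\nAnswer:"
ids = tok(answer_prompt, return_tensors="pt").input_ids
answer_ids = model.generate(ids, max_new_tokens=10, do_sample=False)
print(tok.decode(answer_ids[0, ids.shape[1]:], skip_special_tokens=True))
```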
Editing Large Language Models: Problems, Methods, and Opportunities
Yunzhi Yao | Peng Wang | Bozhong Tian | Siyuan Cheng | Zhoubo Li | Shumin Deng | Huajun Chen | Ningyu Zhang
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
Despite our ability to train capable LLMs, the methodology for maintaining their relevance and rectifying errors remains elusive. To this end, the past few years have witnessed a surge in techniques for editing LLMs, with the objective of efficiently altering the behavior of LLMs within a specific domain without negatively impacting performance on other inputs. This paper embarks on a deep exploration of the problems, methods, and opportunities related to model editing for LLMs. In particular, we provide an exhaustive overview of the task definition and challenges associated with model editing, along with an in-depth empirical analysis of the most advanced methods currently at our disposal. We also build a new benchmark dataset to facilitate more robust evaluation and to pinpoint enduring issues intrinsic to existing techniques. Our objective is to provide valuable insights into the effectiveness and feasibility of each editing technique, thereby assisting the community in choosing the most appropriate method for a specific task or context.