Shanbao Qiao
2024
COMEM: In-Context Retrieval-Augmented Mass-Editing Memory in Large Language Models
Shanbao Qiao
|
Xuebing Liu
|
Seung-Hoon Na
Findings of the Association for Computational Linguistics: NAACL 2024
Noting that world knowledge continuously evolves over time, large language models (LLMs) need to be properly adjusted by performing the “knowledge editing”, which involves updating outdated information or correcting false information. To achieve reliable and “massive” editing capabilities in terms of generalization and specificity, this paper proposes a unified knowledge editing method called in-COntext retrieval-augmented Mass-Editing Memory (COMEM), which combines two types of editing approaches: parameter updating and in-context knowledge editing (IKE). In particular, COMEM incorporates retrieval-augmented IKE, a novel extension of IKE designed for massive editing tasks, based on an updating-aware demonstration construction.Experimental results on the zsRE and CounterFact datasets demonstrate that COMEM outperforms all existing methods, achieving state-of-the-art performance. Our code is available at https://github.com/JoveReCode/COMEM.git.
DistillMIKE: Editing Distillation of Massive In-Context Knowledge Editing in Large Language Models
Shanbao Qiao
|
Xuebing Liu
|
Seung-Hoon Na
Findings of the Association for Computational Linguistics: ACL 2024
Among the recently emerged knowledge editing methods, in-context knowledge editing (IKE) has shown respectable abilities on knowledge editing in terms of generalization and specificity. Noting the promising advantages but unexplored issues of IKE, we propose **DistillMIKE** as a novel extension of IKE, i.e., editing **distill**ation of "**M**assive” **I**n-context **K**nowledge **E**diting in large language models (LLMs), mainly consisting of two expansions; 1) *Massive in-context knowledge editing (MIKE)*, which extends IKE to a massive editing task, aiming to inject not a single edit but a set of massive edits to LLMs; To preserve specificity, our key novel extension is a “selective” retrieval augmentation, where the retrieval-augmented IKE is only applied to “in-scope” examples, whereas the unedited model without IKE is employed for “out-of-scope” ones. 2) *Editing distillation* of MIKE using low-rank adaptation (LoRA), which distills editing abilities of MIKE to parameters of LLMs in a manner of eliminating the need of lengthy in-context demonstrations, thus removing the computational overhead encountered at the inference time. Experimental results on the zsRE and CounterFact datasets demonstrate that MIKE shows the state-of-the-art perfomrances and DistilMIKE show comparable performances with MIKE. Our code is available at https://github.com/JoveReCode/DistillMIKE.git.
2023
DiffusionRet: Diffusion-Enhanced Generative Retriever using Constrained Decoding
Shanbao Qiao
|
Xuebing Liu
|
Seung-Hoon Na
Findings of the Association for Computational Linguistics: EMNLP 2023
Generative retrieval, which maps from a query to its relevant document identifiers (docids), has recently emerged as a new information retrieval (IR) paradigm, however, having suffered from 1) the lack of the intermediate reasoning step, caused by the manner of merely using a query to perform the hierarchical classification, and 2) the pretrain-finetune discrepancy, which comes from the use of the artificial symbols of docids. To address these limitations, we propose the novel approach of using the document generation from a query as an intermediate step before the retrieval, thus presenting ̲diffusion-enhanced generative ̲retrieval (DiffusionRet), which consists of two processing steps: 1) the diffusion-based document generation, which employs the sequence-to-sequence diffusion model to produce a pseudo document sample from a query, being expected to semantically close to a relevant document; 2) N-gram-based generative retrieval, which use another sequence-to-sequence model to generate n-grams that appear in the collection index for linking a generated sample to an original document. Experiment results on MS MARCO and Natural Questions dataset show that the proposed DiffusionRet significantly outperforms all the existing generative retrieval methods and leads to the state-of-the-art performances, even with much smaller number of parameters.
Search