Shanbao Qiao

2025

SeqMMR: Sequential Model Merging and LLM Routing for Enhanced Batched Sequential Knowledge Editing
Shanbao Qiao | Xuebing Liu | Akshat Gupta | Seung-Hoon Na
Findings of the Association for Computational Linguistics: ACL 2025

Model knowledge editing enables the efficient correction of erroneous information and the continuous updating of outdated knowledge within language models. While existing research has demonstrated strong performance in single-instance or few-instance sequential editing and one-time massive editing scenarios, the batched sequential editing paradigm remains a significant challenge. The primary issue lies in the model’s tendency to gradually forget previously edited knowledge and become increasingly unstable after multiple iterations of batched editing. To address these challenges, we propose **SeqMMR**, an enhanced framework for batched sequential knowledge editing that leverages **Seq**uential **M**odel **M**erging and a model **R**outer. Our approach iteratively merges parameters from current batch-edited models with those of their predecessors, ensuring that newly emerging knowledge is integrated while mitigating the forgetting of previously edited knowledge. Furthermore, the model router directs queries unrelated to the edited knowledge to an unedited model backup, preventing unintended alterations in model predictions. Extensive experiments across various datasets demonstrate that our approach effectively mitigates knowledge forgetting, improves performance across all previous batches, and better preserves the model’s general capabilities.

pdf bib abs

GenPoE: Generative Passage-level Mixture of Experts for Knowledge Enhancement of LLMs
Xuebing Liu | Shanbao Qiao | Seung-Hoon Na
Findings of the Association for Computational Linguistics: EMNLP 2025

Typically, parametric adaptation methods such as domain-adaptive pretraining (DAP) and retrieval-augmented generation (RAG) have been considered effective approaches for adapting large language models (LLMs) to new knowledge or domains. To unify positive effects of parametric adaptation and RAG, this paper proposes GenPoE, i.e., “generative’’ passage-level mixture of experts (MoEs) for enhancing knowledge of LLMs. The key component is its novel MoE-generating hypernetwork which takes in-context retrieved passages and generates their “expert’’ parameters, where these generated parameters are then integrated into LLMs by forming expert networks. With its use of “generated’’ parameters, GenPoE does not require a separate parameter training or finetuning stage, which is often costly. By parameterizing passages into expert networks, GenPoE likely exhibits robustness even when the retrieved passages are irrelevant. Experiment results in two open-domain question answering (QA) tasks present that GenPoE shows improved performances over other passage-level knowledge editing, and its combination of RAG produces superior performances over RAG. Our data and code will be available at https://github.com/Liu-Xuebing/GenPoE.

2024

pdf bib

COMEM: In-Context Retrieval-Augmented Mass-Editing Memory in Large Language Models
Shanbao Qiao | Xuebing Liu | Seung-Hoon Na
Findings of the Association for Computational Linguistics: NAACL 2024

pdf bib abs

DistillMIKE: Editing Distillation of Massive In-Context Knowledge Editing in Large Language Models
Shanbao Qiao | Xuebing Liu | Seung-Hoon Na
Findings of the Association for Computational Linguistics: ACL 2024

Among the recently emerged knowledge editing methods, in-context knowledge editing (IKE) has shown respectable abilities on knowledge editing in terms of generalization and specificity. Noting the promising advantages but unexplored issues of IKE, we propose **DistillMIKE** as a novel extension of IKE, i.e., editing **distill**ation of "**M**assive” **I**n-context **K**nowledge **E**diting in large language models (LLMs), mainly consisting of two expansions; 1) *Massive in-context knowledge editing (MIKE)*, which extends IKE to a massive editing task, aiming to inject not a single edit but a set of massive edits to LLMs; To preserve specificity, our key novel extension is a “selective” retrieval augmentation, where the retrieval-augmented IKE is only applied to “in-scope” examples, whereas the unedited model without IKE is employed for “out-of-scope” ones. 2) *Editing distillation* of MIKE using low-rank adaptation (LoRA), which distills editing abilities of MIKE to parameters of LLMs in a manner of eliminating the need of lengthy in-context demonstrations, thus removing the computational overhead encountered at the inference time. Experimental results on the zsRE and CounterFact datasets demonstrate that MIKE shows the state-of-the-art perfomrances and DistilMIKE show comparable performances with MIKE. Our code is available at https://github.com/JoveReCode/DistillMIKE.git.

2023

pdf bib abs

DiffusionRet: Diffusion-Enhanced Generative Retriever using Constrained Decoding
Shanbao Qiao | Xuebing Liu | Seung-Hoon Na
Findings of the Association for Computational Linguistics: EMNLP 2023

Generative retrieval, which maps from a query to its relevant document identifiers (docids), has recently emerged as a new information retrieval (IR) paradigm, however, having suffered from 1) the lack of the intermediate reasoning step, caused by the manner of merely using a query to perform the hierarchical classification, and 2) the pretrain-finetune discrepancy, which comes from the use of the artificial symbols of docids. To address these limitations, we propose the novel approach of using the document generation from a query as an intermediate step before the retrieval, thus presenting ̲diffusion-enhanced generative ̲retrieval (DiffusionRet), which consists of two processing steps: 1) the diffusion-based document generation, which employs the sequence-to-sequence diffusion model to produce a pseudo document sample from a query, being expected to semantically close to a relevant document; 2) N-gram-based generative retrieval, which use another sequence-to-sequence model to generate n-grams that appear in the collection index for linking a generated sample to an original document. Experiment results on MS MARCO and Natural Questions dataset show that the proposed DiffusionRet significantly outperforms all the existing generative retrieval methods and leads to the state-of-the-art performances, even with much smaller number of parameters.

Co-authors

Venues

Findings5

Fix author