Zhijing Wu

2025

A Persona-Aware LLM-Enhanced Framework for Multi-Session Personalized Dialogue Generation
Dongshuo Liu | Zhijing Wu | Dandan Song | Heyan Huang
Findings of the Association for Computational Linguistics: ACL 2025

Multi-session personalized dialogue generation is one of the most important topics in open-domain dialogue. It aims to generate responses consistent with the dialogue history and personality information across multiple sessions to engage users’ interest in the dialogue. Recent approaches focusing on history modeling and persona modeling have advanced the development of this field. However, they overlook the importance of dialogue structure in helping large language models (LLMs) understand the dialogue context. Moreover, these methods do not efficiently expand and utilize personality information, reducing the responses’ consistency. In this paper, we propose a Persona-Aware LLM-enAnCEd(PALACE) framework for multi-session personalized dialogue generation. Specifically, the framework consists of three components: a topic-aware memory bank, a persona prompt learning module, and VAE-LoRA. The topic-aware memory bank works by retrieving historical information that possesses a certain dialogue structure and relevant topics. The persona prompt learning module enhances the LLM’s persona-aware capabilities by utilizing a persona commonsense knowledge graph and a query-driven graph neural network. Furthermore, to enhance the generative capabilities of the LLM and obtain more useful prior knowledge, we combine VAE with LoRA to propose VAE-LoRA. Experimental results on the MSC and DuLeMon dataset demonstrate that our framework outperforms the state-of-the-art methods in automatic and human evaluation metrics.

pdf bib abs

GRV-KBQA: A Three-Stage Framework for Knowledge Base Question Answering with Decoupled Logical Structure, Semantic Grounding and Structure-Aware Validation
Yuhang Tian | Pan Yang | Dandan Song | Zhijing Wu | Hao Wang
Findings of the Association for Computational Linguistics: EMNLP 2025

Knowledge Base Question Answering (KBQA) is a fundamental task that enables natural language interaction with structured knowledge bases (KBs).Given a natural language question, KBQA aims to retrieve the answers from the KB. However, existing approaches, including retrieval-based, semantic parsing-based methods and large-language model-based methods often suffer from generating non-executable queries and inefficiencies in query execution. To address these challenges, we propose GRV-KBQA, a three-stage framework that decouples logical structure generation from semantic grounding and incorporates structure-aware validation to enhance accuracy. Unlike previous methods, GRV-KBQA explicitly enforces KB constraints to improve alignment between generated logical forms and KB structures. Experimental results on WebQSP and CWQ show that GRV-KBQA significantly improves performance over existing approaches. The ablation study conducted confirms the effectiveness of the decoupled logical form generation and validation mechanism of our framework.

pdf bib abs

Knowledge Base Question Answering (KBQA) aims to extract accurate answers from the Knowledge Base (KB). Traditional Semantic Parsing (SP)-based methods are widely used but struggle with complex queries. Recently, large language models (LLMs) have shown promise in improving KBQA performance. However, the challenge of generating error-free logical forms remains, as skeleton, topic Entity, and relation Errors still frequently occur. To address these challenges, we propose CompKBQA(Component-wise Task Decomposition for Knowledge Base Question Answering), a novel framework that optimizes the process of fine-tuning a LLM for generating logical forms by enabling the LLM to progressively learn relevant sub-tasks like skeleton generation, topic entity generation, and relevant relations generation. Additionally, we propose R³, which retrieves and incorporates KB information into the process of logical form generation. Experimental evaluations on two benchmark KBQA datasets, WebQSP and CWQ, demonstrate that CompKBQA achieves state-of-the-art performance, highlighting the importance of task decomposition and KB-aware learning.

pdf bib abs

Path-enhanced Pre-trained Language Model for Knowledge Graph Completion
Hao Wang | Dandan Song | Zhijing Wu | Yuhang Tian | Pan Yang
Findings of the Association for Computational Linguistics: EMNLP 2025

Pre-trained language models (PLMs) have achieved remarkable knowledge graph completion(KGC) success. However, most methods derive KGC results mainly from triple-level and text-described learning, which lack the capability to capture long-term relational and structural information. Moreover, the absence of a visible reasoning process leads to poor interpretability and credibility of the completions. In this paper, we propose a path-enhanced pre-trained language model-based knowledge graph completion method (PEKGC), which employs multi-view generation to infer missing facts in triple-level and path-level simultaneously to address lacking long-term relational information and interpretability issues. Furthermore, a neighbor selector module is proposed to filter neighbor triples to provide the adjacent structural information. Besides, we propose a fact-level re-evaluation and a heuristic fusion ranking strategy for candidate answers to fuse multi-view predictions. Extensive experiments on the benchmark datasets demonstrate that our model significantly improves the performance of the KGC task.

pdf bib abs

FlashBack: Efficient Retrieval-Augmented Language Modeling for Fast Inference
Runheng Liu | Xingchen Xiao | Heyan Huang | Zewen Chi | Zhijing Wu
Findings of the Association for Computational Linguistics: ACL 2025

Retrieval-Augmented Language Modeling (RALM) by integrating large language models (LLM) with relevant documents from an external corpus is a proven methodology for enabling the LLM to generate information beyond the scope of its pre-training corpus. Previous work by retrieving a set of tokens iteratively with retrieved content prepending to the input poses a high runtime issue, which degrades the inference efficiency of the LLMs because they fail to use the Key-Value (KV) cache efficiently. We propose FlashBack, a modular RALM designed to improve the inference efficiency of RALM with appending context pattern while maintaining decent performance after fine-tuning by Low-Rank Adaption. FlashBack appends retrieved documents at the end of the context for efficiently utilizing the KV cache. We also introduce the Marking Token as two special prompt tokens for marking the appending context during fine-tuning. Our experiments show that FlashBack can improve language modeling performance in perplexity metric. We proved the Marking Token is a usable add-on when fine-tuning models on specific context patterns. By bypassing unnecessary re-computation, FlashBack achieves fast inference speed speed with long context input. The inference speed is up to 4× faster than the prepending counterpart on a 7B LLM (Llama 2) in the runtime test.

pdf bib abs

As Large Language Models (LLMs) demonstrate increasingly strong human-like capabilities, the need to align them with human values has become significant. Recent advanced techniques, such as prompt learning and reinforcement learning, are being employed to bring LLMs closer to aligning with human values. While these techniques address broad ethical and helpfulness concerns, they rarely consider simulating individualized human values. To bridge this gap, we propose SimVBG, a framework that simulates individual values based on individual backstories that reflect their past experience and demographic information. SimVBG transforms structured data on an individual to a backstory and utilizes a multi-module architecture inspired by the Cognitive–Affective Personality System to simulate individual value based on the backstories. We test SimVBG on a self-constructed benchmark derived from the World Values Survey and show that SimVBG improves top-1 accuracy by more than 10% over the retrieval-augmented generation method. Further analysis shows that performance increases as additional interaction user history becomes available, indicating that the model can refine its persona over time. Code, dataset, and complete experimental results are available at https://github.com/bangdedadi/SimVBG.

2024

pdf bib abs

DRAGIN: Dynamic Retrieval Augmented Generation based on the Real-time Information Needs of Large Language Models
Weihang Su | Yichen Tang | Qingyao Ai | Zhijing Wu | Yiqun Liu
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Dynamic retrieval augmented generation (RAG) paradigm actively decides when and what to retrieve during the text generation process of Large Language Models (LLMs).There are two key elements of this paradigm: identifying the optimal moment to activate the retrieval module (deciding when to retrieve) and crafting the appropriate query once retrieval is triggered (determining what to retrieve).However, current dynamic RAG methods fall short in both aspects. Firstly, the strategies for deciding when to retrieve often rely on static rules. Moreover, the strategies for deciding what to retrieve typically limit themselves to the LLM’s most recent sentence or the last few tokens, while the LLM’s information needs may span across the entire context.To overcome these limitations, we introduce a new framework, DRAGIN, i.e., Dynamic Retrieval Augmented Generation based on the Information Needs of LLMs. Our framework is specifically designed to make decisions on when and what to retrieve based on the LLM’s information needs during the text generation process.We evaluate DRAGIN along with existing methods comprehensively over 4 knowledge-intensive generation datasets. Experimental results show that DRAGIN achieves superior performance on all tasks, demonstrating the effectiveness of our method.

pdf bib abs

In event argument extraction (EAE), a promising approach involves jointly encoding text and argument roles, and performing multiple token linking operations. This approach further falls into two categories. One extracts arguments within a single event, while the other attempts to extract arguments from multiple events simultaneously. However, the former lacks to leverage cross-event information and the latter requires tougher predictions with longer encoded role sequences and extra linking operations. In this paper, we design a novel separation-and-fusion paradigm to separately acquire cross-event information and fuse it into the argument extraction of a target event. Following the paradigm, we propose a novel multiple token linking model named Sep2F, which can effectively build event correlations via roles and preserve the simple linking predictions of single-event extraction. In particular, we employ one linking module to extract arguments for the target event and another to aggregate the role information of multiple events. More importantly, we propose a novel two-fold fusion module to ensure that the aggregated cross-event information serves EAE well. We evaluate our proposed model on sentence-level and document-level datasets, including ACE05, RAMS, WikiEvents and MLEE. The extensive experimental results indicate that our model outperforms the state-of-the-art EAE models on all the datasets.

pdf bib abs

PEK: A Parameter-Efficient Framework for Knowledge-Grounded Dialogue Generation
Pan Yang | Dandan Song | Zhijing Wu | Yanru Zhou
Findings of the Association for Computational Linguistics: ACL 2024

Pre-trained language models (PLMs) have shown great dialogue generation capability in different scenarios. However, the huge VRAM consumption when fine-tuning them is one of their drawbacks. PEFT approaches can significantly reduce the number of trainable parameters, which enables us to fine-tune larger dialogue generation models. However, the reduction in parameter quantity can diminish a PLM’s expressive capacity and affect the PLM’s learning from certain specific examples like knowledge-related conversations. Previous works have demonstrated that injecting external knowledge into dialogue generation models can improve the model’s performance in knowledge-related conversations. Nonetheless, these methods are designed for the scenario where most parameters of the entire framework are trainable. In this paper, we propose PEK, a parameter-efficient framework for knowledge-enhanced dialogue generation. It enables PLMs to leverage external knowledge documents and knowledge graphs to enhance its generation capabilities with an acceptable number of trainable parameters. Evaluation results on the Wizard of Wikipedia and CMU_DoG datasets show that our approach outperforms baseline methods on multiple evaluation metrics, which validates the effectiveness of our approach.

pdf bib abs

Hallucinations in large language models (LLMs) refer to the phenomenon of LLMs producing responses that are coherent yet factually inaccurate. This issue undermines the effectiveness of LLMs in practical applications, necessitating research into detecting and mitigating hallucinations of LLMs. Previous studies have mainly concentrated on post-processing techniques for hallucination detection, which tend to be computationally intensive and limited in effectiveness due to their separation from the LLM’s inference process. To overcome these limitations, we introduce MIND, an unsupervised training framework that leverages the internal states of LLMs for real-time hallucination detection without requiring manual annotations. Additionally, we present HELM, a new benchmark for evaluating hallucination detection across multiple LLMs, featuring diverse LLM outputs and the internal states of LLMs during their inference process. Our experiments demonstrate that MIND outperforms existing state-of-the-art methods in hallucination detection.

pdf bib abs

Recently, significant progress has been made in employing Large Language Models (LLMs) for semantic parsing to address Knowledge Base Question Answering (KBQA) tasks. Previous work utilize LLMs to generate query statements on Knowledge Bases (KBs) for retrieving answers. However, LLMs often generate incorrect query statements due to the lack of relevant knowledge in the previous methods. To address this, we propose a framework called Augmenting Reasoning Capabilities of LLMs with Graph Structures in Knowledge Base Question Answering (ARG-KBQA), which retrieves question-related graph structures to improve the performance of LLMs. Unlike other methods that directly retrieve relations or triples from KBs, we introduce an unsupervised two-stage ranker to perform multi-hop beam search on KBs, which could provide LLMs with more relevant information to the questions. Experimental results demonstrate that ARG-KBQA sets a new state-of-the-art on GrailQA and WebQSP under the few-shot setting. Additionally, ARG-KBQA significantly outperforms previous few-shot methods on questions with unseen query statement in the training data.

2022

pdf bib abs

A Multi-turn Machine Reading Comprehension Framework with Rethink Mechanism for Emotion-Cause Pair Extraction
Changzhi Zhou | Dandan Song | Jing Xu | Zhijing Wu
Proceedings of the 29th International Conference on Computational Linguistics

Emotion-cause pair extraction (ECPE) is an emerging task in emotion cause analysis, which extracts potential emotion-cause pairs from an emotional document. Most recent studies use end-to-end methods to tackle the ECPE task. However, these methods either suffer from a label sparsity problem or fail to model complicated relations between emotions and causes. Furthermore, they all do not consider explicit semantic information of clauses. To this end, we transform the ECPE task into a document-level machine reading comprehension (MRC) task and propose a Multi-turn MRC framework with Rethink mechanism (MM-R). Our framework can model complicated relations between emotions and causes while avoiding generating the pairing matrix (the leading cause of the label sparsity problem). Besides, the multi-turn structure can fuse explicit semantic information flow between emotions and causes. Extensive experiments on the benchmark emotion cause corpus demonstrate the effectiveness of our proposed framework, which outperforms existing state-of-the-art methods.

pdf bib abs

Continual Machine Reading Comprehension via Uncertainty-aware Fixed Memory and Adversarial Domain Adaptation
Zhijing Wu | Hua Xu | Jingliang Fang | Kai Gao
Findings of the Association for Computational Linguistics: NAACL 2022

Continual Machine Reading Comprehension aims to incrementally learn from a continuous data stream across time without access the previous seen data, which is crucial for the development of real-world MRC systems. However, it is a great challenge to learn a new domain incrementally without catastrophically forgetting previous knowledge. In this paper, MA-MRC, a continual MRC model with uncertainty-aware fixed Memory and Adversarial domain adaptation, is proposed. In MA-MRC, a fixed size memory stores a small number of samples in previous domain data along with an uncertainty-aware updating strategy when new domain data arrives. For incremental learning, MA-MRC not only keeps a stable understanding by learning both memory and new domain data, but also makes full use of the domain adaptation relationship between them by adversarial learning strategy. The experimental results show that MA-MRC is superior to strong baselines and has a substantial incremental learning ability without catastrophically forgetting under two different continual MRC settings.