Yanchi Liu
2026
Multi-Agent Procedural Graph Extraction with Structural and Logical Refinement
Wangyang Ying | Yanchi Liu | Xujiang Zhao | Wei Cheng | Zhengzhang Chen | Wenchao Yu | Yanjie Fu | Haifeng Chen
Findings of the Association for Computational Linguistics: EACL 2026
Automatically extracting workflows as procedural graphs from natural language is a promising yet underexplored task that requires ensuring both structural validity and logical alignment. Recent advances in large language models (LLMs) show potential for graph extraction, but often yield ill-formed structures or misinterpret logical constructs such as gateways. We introduce a multi-agent framework that treats procedural graph extraction as a multi-round reasoning process with structural and logical refinement agents. The framework operates in three iterative stages: (1) an LLM-based graph extraction phase, (2) a structural feedback phase where a simulation agent diagnoses and explains structural issues, and (3) a logical feedback phase where a semantic agent aligns semantics between flow logic and linguistic cues in the source text. Important feedback is prioritized and expressed in natural language, which is injected into the next-round prompt, enabling interpretable and controllable refinement. This modular design allows agents to target distinct error types without supervision or parameter updates. Experiments demonstrate that the framework achieves substantial improvements in both structural correctness and logical consistency over strong baselines.
DeepSieve: Information Sieving via LLM-as-a-Knowledge-Router
Minghao Guo | Qingcheng Zeng | Xujiang Zhao | Yanchi Liu | Wenchao Yu | Mengnan Du | Haifeng Chen | Wei Cheng
Findings of the Association for Computational Linguistics: EACL 2026
Large Language Models (LLMs) excel at many reasoning tasks but struggle with knowledge-intensive queries due to their inability to dynamically access up-to-date or domain-specific information. Retrieval-Augmented Generation (RAG) has emerged as a promising solution, enabling LLMs to ground their responses in external sources. However, existing RAG methods lack fine-grained control over both the query and source sides, often resulting in noisy retrieval and shallow reasoning. In this work, we introduce DeepSieve, an agentic RAG framework that incorporates information sieving via LLM-as-a-knowledge-router. DeepSieve decomposes complex queries into structured sub-questions and recursively routes each to the most suitable knowledge source, filtering irrelevant information through a multi-stage distillation process. Our design emphasizes modularity, transparency, and adaptability, leveraging recent advances in agentic system design. Experiments on multi-hop QA tasks across heterogeneous sources demonstrate improved reasoning depth, retrieval precision, and interpretability over conventional RAG approaches.
Decoding Time Series with LLMs: A Multi-Agent Framework for Cross-Domain Annotation
Minhua Lin | Zhengzhang Chen | Yanchi Liu | Xujiang Zhao | Zongyu Wu | Junxiang Wang | Xiang Zhang | Suhang Wang | Haifeng Chen
Findings of the Association for Computational Linguistics: EACL 2026
Time series data is ubiquitous across various domains, including manufacturing, finance, and healthcare. High-quality annotations are essential for effectively understanding time series and facilitating downstream tasks. However, obtaining such annotations is challenging, particularly in mission-critical domains. In this paper, we propose TESSA, a multi-agent system designed to automatically generate both general and domain-specific annotations for time series data. TESSA introduces two agents: a general annotation agent and a domain-specific annotation agent. The general agent captures common patterns and knowledge across multiple source domains, leveraging both time-series-wise and text-wise features to generate general annotations. Meanwhile, the domain-specific agent utilizes limited annotations from the target domain to learn domain-specific terminology and generate targeted annotations. Extensive experiments on multiple synthetic and real-world datasets demonstrate that TESSA effectively generates high-quality annotations, outperforming existing methods.
2025
Uncertainty Propagation on LLM Agent
Qiwei Zhao | Dong Li | Yanchi Liu | Wei Cheng | Yiyou Sun | Mika Oishi | Takao Osaki | Katsushi Matsuda | Huaxiu Yao | Chen Zhao | Haifeng Chen | Xujiang Zhao
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Large language models (LLMs) integrated into multi-step agent systems enable complex decision-making processes across various applications. However, their outputs often lack reliability, making uncertainty estimation crucial. Existing uncertainty estimation methods primarily focus on final-step outputs, which fail to account for cumulative uncertainty over the multi-step decision-making process and the dynamic interactions between agents and their environments. To address these limitations, we propose SAUP (Situation Awareness Uncertainty Propagation), a novel framework that propagates uncertainty through each step of an LLM-based agent’s reasoning process. SAUP incorporates situational awareness by assigning situational weights to each step’s uncertainty during the propagation. Our method, compatible with various one-step uncertainty estimation techniques, provides a comprehensive and accurate uncertainty measure. Extensive experiments on benchmark datasets demonstrate that SAUP significantly outperforms existing state-of-the-art methods, achieving up to 20% improvement in AUROC.
MixLLM: Dynamic Routing in Mixed Large Language Models
Xinyuan Wang | Yanchi Liu | Wei Cheng | Xujiang Zhao | Zhengzhang Chen | Wenchao Yu | Yanjie Fu | Haifeng Chen
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
Large Language Models (LLMs) have recently shown potential for artificial general intelligence; however, their usage is costly and comes with high response latency. Given mixed LLMs with their own strengths and weaknesses, LLM routing aims to identify the most suitable model for each query in the stream to maximize response quality and minimize cost and latency. However, the challenges involve: (1) dynamic trade-offs among quality, cost, and latency; (2) enabling continual learning in deployed systems; and (3) navigating a varying (e.g., new LLM addition or old LLM removal) set of LLM candidates over time. To bridge these gaps, we develop MixLLM, a dynamic contextual-bandit-based routing system for query-LLM assignment. Specifically, we first leverage query tags to enhance query embeddings for the routing task. Next, we design lightweight prediction models to estimate the response qualities and costs of queries over LLMs. We then devise a meta-decision maker to choose the query-LLM assignments that best trade off response quality, cost, and latency. Finally, the system benefits from continual training, allowing it to adapt to evolving queries and user feedback over time. Our extensive experiments show that MixLLM achieves the best trade-offs in response quality, cost, and latency (97.25% of GPT-4’s quality at 24.18% of the cost under the time constraint).
2024
Large Language Models Can Be Contextual Privacy Protection Learners
Yijia Xiao | Yiqiao Jin | Yushi Bai | Yue Wu | Xianjun Yang | Xiao Luo | Wenchao Yu | Xujiang Zhao | Yanchi Liu | Quanquan Gu | Haifeng Chen | Wei Wang | Wei Cheng
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
The proliferation of Large Language Models (LLMs) has driven considerable interest in fine-tuning them with domain-specific data to create specialized language models. Nevertheless, such domain-specific fine-tuning data often contains contextually sensitive personally identifiable information (PII). Directly fine-tuning LLMs on this data without privacy protection poses a risk of leaking sensitive PII at inference time. To address this challenge, we introduce Contextual Privacy Protection Language Models (CPPLM), a novel paradigm for fine-tuning LLMs that effectively injects domain-specific knowledge while safeguarding inference-time data privacy. Our work offers a theoretical analysis for model design and delves into various techniques such as corpus curation, penalty-based unlikelihood in training loss, and instruction-based tuning. Extensive experiments across diverse datasets and scenarios demonstrate the effectiveness of our approaches. In particular, instruction tuning with both positive and negative examples stands out as a promising method, effectively protecting private data while enhancing the model’s knowledge. Our work underscores the potential for Large Language Models as robust contextual privacy protection learners.
Pruning as a Domain-specific LLM Extractor
Nan Zhang | Yanchi Liu | Xujiang Zhao | Wei Cheng | Runxue Bao | Rui Zhang | Prasenjit Mitra | Haifeng Chen
Findings of the Association for Computational Linguistics: NAACL 2024
Large Language Models (LLMs) have exhibited remarkable proficiency across a wide array of NLP tasks. However, the escalation in model size also engenders substantial deployment costs. While a few efforts have explored model pruning techniques to reduce the size of LLMs, they mainly center on general or task-specific weights. This leads to suboptimal performance due to a lack of specificity on the target domain or generality on different tasks when applied to domain-specific challenges. This work introduces an innovative unstructured dual-pruning methodology, D-Pruner, for domain-specific compression of LLMs. It extracts a compressed, domain-specific, and task-agnostic LLM by identifying LLM weights that are pivotal for general capabilities, like linguistic capability and multi-task solving, and domain-specific knowledge. More specifically, we first assess general weight importance by quantifying the error incurred upon their removal with the help of an open-domain calibration dataset. Then, we utilize this general weight importance to refine the training loss, so that it preserves generality when fitting into a specific domain. Moreover, by efficiently approximating weight importance with the refined training loss on a domain-specific calibration dataset, we obtain a pruned model emphasizing generality and specificity. Our comprehensive experiments across various tasks in healthcare and legal domains show the effectiveness of D-Pruner in domain-specific compression. Our code is available at https://github.com/psunlpgroup/D-Pruner.
Distantly-Supervised Joint Extraction with Noise-Robust Learning
Yufei Li | Xiao Yu | Yanghong Guo | Yanchi Liu | Haifeng Chen | Cong Liu
Findings of the Association for Computational Linguistics: ACL 2024
Joint entity and relation extraction is a process that identifies entity pairs and their relations using a single model. We focus on the problem of joint extraction in distantly-labeled data, whose labels are generated by aligning entity mentions with the corresponding entity and relation tags using a knowledge base (KB). One key challenge is the presence of noisy labels arising from both incorrect entity and relation annotations, which significantly impairs the quality of supervised learning. Existing approaches, which either consider only one source of noise or make decisions using external knowledge, cannot make good use of significant information in the training data. We propose DENRL, a generalizable framework that 1) incorporates a lightweight transformer backbone into a sequence labeling scheme for joint tagging, and 2) employs a noise-robust framework that regularizes the tagging model with significant relation patterns and entity-relation dependencies, then iteratively self-adapts to instances with less noise from both sources. Surprisingly, experiments on two benchmark datasets show that DENRL, using merely its own parametric distribution and simple data-driven heuristics, outperforms strong baselines by a large margin with better interpretability.
InfuserKI: Enhancing Large Language Models with Knowledge Graphs via Infuser-Guided Knowledge Integration
Fali Wang | Runxue Bao | Suhang Wang | Wenchao Yu | Yanchi Liu | Wei Cheng | Haifeng Chen
Findings of the Association for Computational Linguistics: EMNLP 2024
Large Language Models (LLMs) have achieved exceptional capabilities in open generation across various domains, yet they encounter difficulties with tasks that require intensive knowledge. To address these challenges, methods for integrating knowledge have been developed, which augment LLMs with domain-specific knowledge graphs through external modules. These approaches, however, face data inefficiency issues as they necessitate the processing of both known and unknown knowledge for fine-tuning. Thus, our research focuses on a novel problem: efficiently integrating unknown knowledge into LLMs without unnecessary overlap of known knowledge. A risk of introducing new knowledge is the potential forgetting of existing knowledge. To mitigate this risk, we propose the innovative InfuserKI framework. This framework employs transformer internal states to determine when to enrich LLM outputs with additional information, effectively preventing knowledge forgetting. Performance evaluations using the UMLS-2.5k and MetaQA domain knowledge graphs reveal that InfuserKI not only successfully integrates new knowledge but also outperforms state-of-the-art baselines, reducing knowledge forgetting by 9% and 6%, respectively.
Uncertainty Quantification for In-Context Learning of Large Language Models
Chen Ling | Xujiang Zhao | Xuchao Zhang | Wei Cheng | Yanchi Liu | Yiyou Sun | Mika Oishi | Takao Osaki | Katsushi Matsuda | Jie Ji | Guangji Bai | Liang Zhao | Haifeng Chen
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
In-context learning has emerged as a groundbreaking ability of Large Language Models (LLMs) and revolutionized various fields by providing a few task-relevant demonstrations in the prompt. However, trustworthiness issues with LLMs’ responses, such as hallucination, have also been actively discussed. Existing works have been devoted to quantifying the uncertainty in LLMs’ responses, but they often overlook the complex nature of LLMs and the uniqueness of in-context learning. In this work, we delve into the predictive uncertainty of LLMs associated with in-context learning, highlighting that such uncertainties may stem from both the provided demonstrations (aleatoric uncertainty) and ambiguities tied to the model’s configurations (epistemic uncertainty). We propose a novel formulation and corresponding estimation method to quantify both types of uncertainties. The proposed method offers an unsupervised way to understand the prediction of in-context learning in a plug-and-play fashion. Extensive experiments are conducted to demonstrate the effectiveness of the decomposition. The code and data are available at: https://github.com/lingchen0331/UQ_ICL.
2023
Uncertainty-Aware Bootstrap Learning for Joint Extraction on Distantly-Supervised Data
Yufei Li | Xiao Yu | Yanchi Liu | Haifeng Chen | Cong Liu
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
Jointly extracting entity pairs and their relations is challenging when working on distantly-supervised data with ambiguous or noisy labels. To mitigate such impact, we propose uncertainty-aware bootstrap learning, which is motivated by the intuition that the higher the uncertainty of an instance, the more likely the model confidence is inconsistent with the ground truth. Specifically, we first explore instance-level data uncertainty to create an initial set of high-confidence examples. This subset serves to filter out noisy instances and helps the model converge quickly at the early stage. During bootstrap learning, we propose self-ensembling as a regularizer to alleviate inter-model uncertainty produced by noisy labels. We further define probability variance of joint tagging probabilities to estimate inner-model parametric uncertainty, which is used to select and build up new reliable training instances for the next iteration. Experimental results on two large datasets reveal that our approach outperforms existing strong baselines and related methods.
Open-ended Commonsense Reasoning with Unrestricted Answer Candidates
Chen Ling | Xuchao Zhang | Xujiang Zhao | Yanchi Liu | Wei Cheng | Mika Oishi | Takao Osaki | Katsushi Matsuda | Haifeng Chen | Liang Zhao
Findings of the Association for Computational Linguistics: EMNLP 2023
Open-ended Commonsense Reasoning is defined as solving a commonsense question without providing 1) a short list of answer candidates and 2) a pre-defined answer scope. Conventional ways of formulating the commonsense question into a question-answering form or utilizing external knowledge to learn retrieval-based methods are less applicable in the open-ended setting due to an inherent challenge. Without pre-defining an answer scope or a few candidates, open-ended commonsense reasoning entails predicting answers by searching over an extremely large search space. Moreover, most questions require implicit multi-hop reasoning, which presents even more challenges to our problem. In this work, we leverage pre-trained language models to iteratively retrieve reasoning paths on the external knowledge base, which does not require task-specific supervision. The reasoning paths can help to identify the most precise answer to the commonsense question. We conduct experiments on two commonsense benchmark datasets. Compared to other approaches, our proposed method achieves better performance both quantitatively and qualitatively.
2021
Unsupervised Concept Representation Learning for Length-Varying Text Similarity
Xuchao Zhang | Bo Zong | Wei Cheng | Jingchao Ni | Yanchi Liu | Haifeng Chen
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Measuring document similarity plays an important role in natural language processing tasks. Most existing document similarity approaches suffer from the information gap caused by context and vocabulary mismatches when comparing varying-length texts. In this paper, we propose an unsupervised concept representation learning approach to address the above issues. Specifically, we propose a novel Concept Generation Network (CGNet) to learn concept representations from the perspective of the entire text corpus. Moreover, a concept-based document matching method is proposed to leverage advances in the recognition of local phrase features and corpus-level concept features. Extensive experiments on real-world data sets demonstrate that the new method can achieve a considerable improvement in comparing length-varying texts. In particular, our model achieved a 6.5% better F1 score than the best of the baseline models on a concept-project benchmark dataset.
Co-authors
- Wei Cheng 10
- Haifeng Chen 9
- Xujiang Zhao 9
- Wenchao Yu 5
- Zhengzhang Chen 3
- Haifeng Chen 3
- Katsushi Matsuda 3
- Mika Oishi 3
- Takao Osaki 3
- Xuchao Zhang 3
- Runxue Bao 2
- Yanjie Fu 2
- Yufei Li 2
- Chen Ling 2
- Cong Liu 2
- Yiyou Sun 2
- Suhang Wang 2
- Xiao Yu 2
- Liang Zhao (赵亮) 2
- Yushi Bai 1
- Guangji Bai 1
- Haifeng Chen 1
- Mengnan Du 1
- Quanquan Gu 1
- Yanghong Guo 1
- Minghao Guo 1
- Jie Ji 1
- Yiqiao Jin 1
- Dong Li 1
- Minhua Lin 1
- Xiao Luo 1
- Prasenjit Mitra 1
- Jingchao Ni 1
- Wei Wang 1
- Fali Wang 1
- Xinyuan Wang 1
- Junxiang Wang 1
- Yue Wu 1
- Zongyu Wu 1
- Yijia Xiao 1
- Xianjun Yang 1
- Huaxiu Yao 1
- Wangyang Ying 1
- Qingcheng Zeng 1
- Nan Zhang 1
- Rui Zhang 1
- Xiang Zhang 1
- Qiwei Zhao 1
- Chen Zhao 1
- Bo Zong 1