XueYou Zhang
2025
Beyond Instruction Following: Evaluating Inferential Rule Following of Large Language Models
Wangtao Sun | Chenxiang Zhang | XueYou Zhang | Xuanqing Yu | Ziyang Huang | Haotian Xu | Shizhu He | Jun Zhao | Kang Liu
Proceedings of the 24th China National Conference on Computational Linguistics (CCL 2025)
Although Large Language Models (LLMs) have demonstrated strong instruction-following ability, in real-world scenarios they are also expected to be controlled and guided by inferential rules in order to be safe, accurate, and intelligent. This demands that LLMs possess inferential rule-following capability. However, no prior work has clearly evaluated this capability: previous studies that attempt to do so fail to distinguish inferential rule-following scenarios from instruction-following scenarios. Therefore, this paper first clarifies the concept of inferential rule-following and proposes a comprehensive benchmark, RuleBench, to evaluate a diverse range of inferential rule-following abilities. Our experimental results on a variety of LLMs show that they are still limited in following rules. Our analysis of the evaluation results provides insights into how LLMs can be improved toward better inferential rule-following intelligent agents. We further propose Inferential Rule-Following Tuning (IRFT). The experimental results show that through IRFT, LLMs can learn abstract inferential rule-following abilities from purely synthetic data and then generalize to RuleBench. The data and code can be found at: https://gitee.com/forangel2014/llm-rule-following-code
KMatrix-2: A Comprehensive Heterogeneous Knowledge Collaborative Enhancement Toolkit for Large Language Model
Shun Wu | Di Wu | Wangtao Sun | Ziyang Huang | Xiaowei Yuan | Kun Luo | XueYou Zhang | Shizhu He | Jun Zhao | Kang Liu
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: System Demonstrations
The paper presents KMatrix-2, an open-source toolkit that supports comprehensive heterogeneous knowledge collaborative enhancement for Large Language Models (LLMs). As the successor to KMatrix, our toolkit offers powerful modular components and typical enhancement patterns for convenient construction of mainstream knowledge-enhanced LLM systems. In addition, it provides unified knowledge integration and joint knowledge retrieval methods to achieve more comprehensive heterogeneous knowledge collaborative enhancement. Compared with KMatrix, which mainly focuses on descriptive knowledge, this work additionally considers procedural knowledge. Moreover, systematic inter-context and context-memory knowledge conflict resolution methods are offered for better knowledge integration. Some key research questions in heterogeneous knowledge-enhanced LLM systems are analyzed, and our toolkit’s capability in building such systems is validated.
2024
KMatrix: A Flexible Heterogeneous Knowledge Enhancement Toolkit for Large Language Model
Shun Wu | Di Wu | Kun Luo | XueYou Zhang | Jun Zhao | Kang Liu
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: System Demonstrations
Knowledge-Enhanced Large Language Models (K-LLMs) systems enhance the abilities of Large Language Models (LLMs) using external knowledge. Existing K-LLMs toolkits mainly focus on free-text knowledge, lack support for heterogeneous knowledge such as tables and knowledge graphs, and fall short in providing comprehensive datasets and models and a user-friendly experience. To address this gap, we introduce KMatrix: a flexible heterogeneous knowledge enhancement toolkit for LLMs that includes verbalizing-retrieval and parsing-query methods. Our modular design, together with a control-logic flow diagram, flexibly supports the entire lifecycle of various complex K-LLMs systems, including training, evaluation, and deployment. To assist K-LLMs system research, a series of related knowledge, datasets, and models are integrated into our toolkit, along with performance analyses of K-LLMs systems enhanced by different types of knowledge. Using our toolkit, developers can rapidly build, evaluate, and deploy their own K-LLMs systems.