Shun Wu


2024

KMatrix: A Flexible Heterogeneous Knowledge Enhancement Toolkit for Large Language Model
Shun Wu | Di Wu | Kun Luo | XueYou Zhang | Jun Zhao | Kang Liu
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: System Demonstrations

Knowledge-Enhanced Large Language Models (K-LLMs) systems enhance the abilities of Large Language Models (LLMs) using external knowledge. Existing K-LLMs toolkits mainly focus on free-text knowledge, lack support for heterogeneous knowledge such as tables and knowledge graphs, and fall short in the comprehensiveness of their datasets and models and in user-friendliness. To address this gap, we introduce KMatrix: a flexible heterogeneous knowledge enhancement toolkit for LLMs that includes verbalizing-retrieval and parsing-query methods. Its modular design and control-logic flow diagrams flexibly support the entire lifecycle of various complex K-LLMs systems, including training, evaluation, and deployment. To assist K-LLMs system research, a series of related knowledge sources, datasets, and models are integrated into our toolkit, along with performance analyses of K-LLMs systems enhanced by different types of knowledge. Using our toolkit, developers can rapidly build, evaluate, and deploy their own K-LLMs systems.

2021

Named Entity Recognition via Noise Aware Training Mechanism with Data Filter
Xiusheng Huang | Yubo Chen | Shun Wu | Jun Zhao | Yuantao Xie | Weijian Sun
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021