CogKTR: A Knowledge-Enhanced Text Representation Toolkit for Natural Language Understanding

Zhuoran Jin, Tianyi Men, Hongbang Yuan, Yuyang Zhou, Pengfei Cao, Yubo Chen, Zhipeng Xue, Kang Liu, Jun Zhao


Abstract
As the first step of modern natural language processing, text representation encodes discrete texts as continuous embeddings. Pre-trained language models (PLMs) have demonstrated a strong ability in text representation and have significantly promoted the development of natural language understanding (NLU). However, existing PLMs represent a text solely by its context, which is not enough to support knowledge-intensive NLU tasks. Knowledge is power, and fusing external knowledge explicitly into PLMs can provide knowledgeable text representations. However, previous knowledge-enhanced methods differ in many aspects, making it difficult to reproduce previous methods, implement new methods, and transfer between different methods. It is therefore highly desirable to have a unified paradigm that encompasses all kinds of methods in one framework. In this paper, we propose CogKTR, a knowledge-enhanced text representation toolkit for natural language understanding. Following our proposed Unified Knowledge-Enhanced Paradigm (UniKEP), CogKTR consists of four key stages: knowledge acquisition, knowledge representation, knowledge injection, and knowledge application. CogKTR currently supports easy-to-use knowledge acquisition interfaces, multi-source knowledge embeddings, diverse knowledge-enhanced models, and various knowledge-intensive NLU tasks. Our unified, knowledgeable, and modular toolkit is publicly available on GitHub, with an online system and a short instruction video.
Anthology ID:
2022.emnlp-demos.1
Volume:
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: System Demonstrations
Month:
December
Year:
2022
Address:
Abu Dhabi, UAE
Editors:
Wanxiang Che, Ekaterina Shutova
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
1–11
URL:
https://aclanthology.org/2022.emnlp-demos.1
DOI:
10.18653/v1/2022.emnlp-demos.1
Cite (ACL):
Zhuoran Jin, Tianyi Men, Hongbang Yuan, Yuyang Zhou, Pengfei Cao, Yubo Chen, Zhipeng Xue, Kang Liu, and Jun Zhao. 2022. CogKTR: A Knowledge-Enhanced Text Representation Toolkit for Natural Language Understanding. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, pages 1–11, Abu Dhabi, UAE. Association for Computational Linguistics.
Cite (Informal):
CogKTR: A Knowledge-Enhanced Text Representation Toolkit for Natural Language Understanding (Jin et al., EMNLP 2022)
PDF:
https://aclanthology.org/2022.emnlp-demos.1.pdf