Language Models over Large-Scale Knowledge Base: on Capacity, Flexibility and Reasoning for New Facts

Qiyuan He, Yizhong Wang, Jianfei Yu, Wenya Wang


Abstract
Advancements in language models (LMs) have sparked interest in exploring their potential as knowledge bases (KBs), owing to their capacity to store vast amounts of factual knowledge and their strong semantic understanding. However, existing studies struggle to quantify how much large-scale knowledge can be packed into LMs and lack systematic analyses of LMs' structured reasoning capabilities over the infused knowledge. Addressing these gaps, our research investigates whether LMs can effectively act as large-scale KBs after training over an expansive set of world knowledge triplets, by answering three crucial questions: (1) How well do LMs of different sizes store world knowledge of different frequencies from a large-scale KB? (2) How flexibly can these LMs recall the stored knowledge when prompted with natural language queries? (3) After training on abundant world knowledge, can LMs additionally gain the ability to reason over such information to infer new facts? Our findings indicate that while medium-scale LMs hold promise as world knowledge bases capable of storing knowledge and responding flexibly, enhancements in their reasoning capabilities are necessary to fully realize their potential.
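As a rough illustration of the setup the abstract describes, the sketch below (not the authors' code; the relation IDs and templates are hypothetical) shows one common way KB triplets can be verbalized into declarative training sentences for knowledge infusion and into natural-language probe questions for testing recall.

# Minimal sketch, assuming Wikidata-style (subject, relation, object) triplets.
# Templates and relation IDs here are illustrative, not from the paper.

from typing import Dict, List, Tuple

Triplet = Tuple[str, str, str]  # (subject, relation, object)

# Hypothetical verbalization templates for a few relations.
TEMPLATES: Dict[str, str] = {
    "P19": "{subj} was born in {obj}.",
    "P36": "The capital of {subj} is {obj}.",
    "P108": "{subj} works for {obj}.",
}

QUERY_TEMPLATES: Dict[str, str] = {
    "P19": "Where was {subj} born?",
    "P36": "What is the capital of {subj}?",
    "P108": "Who does {subj} work for?",
}

def to_training_text(triplets: List[Triplet]) -> List[str]:
    """Render each triplet as a declarative sentence for LM training."""
    return [
        TEMPLATES[rel].format(subj=subj, obj=obj)
        for subj, rel, obj in triplets
        if rel in TEMPLATES
    ]

def to_probe(subj: str, rel: str) -> str:
    """Render a natural-language query used to test recall of the stored object."""
    return QUERY_TEMPLATES[rel].format(subj=subj)

if __name__ == "__main__":
    kb = [("Marie Curie", "P19", "Warsaw"), ("France", "P36", "Paris")]
    print(to_training_text(kb))        # sentences for knowledge infusion
    print(to_probe("France", "P36"))   # query probing flexible recall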
Anthology ID:
2025.coling-main.118
Volume:
Proceedings of the 31st International Conference on Computational Linguistics
Month:
January
Year:
2025
Address:
Abu Dhabi, UAE
Editors:
Owen Rambow, Leo Wanner, Marianna Apidianaki, Hend Al-Khalifa, Barbara Di Eugenio, Steven Schockaert
Venue:
COLING
Publisher:
Association for Computational Linguistics
Pages:
1736–1753
URL:
https://aclanthology.org/2025.coling-main.118/
Cite (ACL):
Qiyuan He, Yizhong Wang, Jianfei Yu, and Wenya Wang. 2025. Language Models over Large-Scale Knowledge Base: on Capacity, Flexibility and Reasoning for New Facts. In Proceedings of the 31st International Conference on Computational Linguistics, pages 1736–1753, Abu Dhabi, UAE. Association for Computational Linguistics.
Cite (Informal):
Language Models over Large-Scale Knowledge Base: on Capacity, Flexibility and Reasoning for New Facts (He et al., COLING 2025)
PDF:
https://aclanthology.org/2025.coling-main.118.pdf