Wanting Wen
2025
Evaluating Generalization Capability of Language Models across Abductive, Deductive and Inductive Logical Reasoning
Yu Sheng | Wanting Wen | Linjing Li | Daniel Zeng
Proceedings of the 31st International Conference on Computational Linguistics
Transformer-based language models (LMs) have demonstrated remarkable performance on many natural language tasks, yet the extent to which LMs can generalize to unseen logical rules remains insufficiently explored. In classical logic, abductive, deductive and inductive (ADI) reasoning are defined as the fundamental reasoning types, sharing identical reasoning primitives and properties, and some research has proposed that mutual generalization exists across them. However, in the field of natural language processing, previous research has generally studied LMs’ ADI reasoning capabilities separately, overlooking the generalization across them. To bridge this gap, we propose UniADILR, a novel logical reasoning dataset crafted for assessing the generalization capabilities of LMs across different logical rules. Based on UniADILR, we conduct extensive investigations into LMs’ performance on ADI reasoning from various perspectives. The experimental results reveal the weakness of current LMs in extrapolating to unseen rules and offer new insights for future research in logical reasoning.
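The three reasoning types share the same primitives (a general rule, a specific case, and an observed result) and differ only in which element is inferred. A minimal sketch of that structure, assuming a hypothetical (reasoning_type, premises, conclusion) schema rather than the actual UniADILR format:

```python
# Illustrative only: toy instances of the three classical reasoning types.
# The schema and sentences below are hypothetical and are NOT drawn from
# the UniADILR dataset itself.
from dataclasses import dataclass


@dataclass
class ReasoningInstance:
    reasoning_type: str  # "abductive", "deductive", or "inductive"
    premises: list[str]
    conclusion: str


EXAMPLES = [
    # Deduction: apply a general rule to a specific case.
    ReasoningInstance(
        "deductive",
        ["All metals conduct electricity.", "Copper is a metal."],
        "Copper conducts electricity.",
    ),
    # Induction: generalize a rule from a case and its observed result.
    ReasoningInstance(
        "inductive",
        ["Copper is a metal.", "Copper conducts electricity."],
        "All metals conduct electricity.",
    ),
    # Abduction: infer the most plausible explanation for an observation.
    ReasoningInstance(
        "abductive",
        ["All metals conduct electricity.", "Copper conducts electricity."],
        "Copper is a metal.",
    ),
]

for ex in EXAMPLES:
    print(f"{ex.reasoning_type}: {' + '.join(ex.premises)} => {ex.conclusion}")
```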
2024
Unveiling Factual Recall Behaviors of Large Language Models through Knowledge Neurons
Yifei Wang | Yuheng Chen | Wanting Wen | Yu Sheng | Linjing Li | Daniel Dajun Zeng
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
In this paper, we investigate whether Large Language Models (LLMs) actively recall or retrieve their internal repositories of factual knowledge when faced with reasoning tasks. Through an analysis of LLMs’ internal factual recall at each reasoning step via Knowledge Neurons, we reveal that LLMs fail to harness the critical factual associations under certain circumstances. Instead, they tend to opt for alternative, shortcut-like pathways to answer reasoning questions. By manually manipulating the recall process of parametric knowledge in LLMs, we demonstrate that enhancing this recall process directly improves reasoning performance, whereas suppressing it leads to notable degradation. Furthermore, we assess the effect of Chain-of-Thought (CoT) prompting, a powerful technique for addressing complex reasoning tasks. Our findings indicate that CoT can intensify the recall of factual knowledge by encouraging LLMs to engage in orderly and reliable reasoning. Finally, we explore how contextual conflicts affect the retrieval of facts during the reasoning process, to gain a comprehensive understanding of the factual recall behaviors of LLMs. Code and data will be available soon.
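The enhance/suppress intervention described in the abstract can be pictured as rescaling the activations of a handful of MLP neurons during the forward pass. Below is a minimal sketch using PyTorch forward hooks; the model (gpt2), layer path, neuron indices, and scaling factor are all illustrative assumptions rather than the paper’s actual setup, and real knowledge neurons would first be located with an attribution method.

```python
# A minimal sketch, assuming a GPT-2-style model: scale the activations of
# selected "knowledge neurons" to enhance (factor > 1) or suppress
# (factor = 0) factual recall. The (layer, neuron) coordinates here are
# hypothetical; the paper's exact intervention point may differ.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")  # placeholder model choice
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

# Hypothetical neuron coordinates in the MLP intermediate layer.
knowledge_neurons = [(5, 407), (5, 1023)]
factor = 2.0  # >1 amplifies recall; 0.0 suppresses it


def make_hook(indices, factor):
    def hook(module, inputs, output):
        # output: (batch, seq, intermediate_dim); rescale chosen neurons only.
        output[..., indices] *= factor
        return output
    return hook


# Group neuron indices by layer and attach one hook per layer.
by_layer = {}
for layer, idx in knowledge_neurons:
    by_layer.setdefault(layer, []).append(idx)
handles = [
    model.transformer.h[layer].mlp.c_fc.register_forward_hook(
        make_hook(indices, factor)
    )
    for layer, indices in by_layer.items()
]

inputs = tok("The capital of France is", return_tensors="pt")
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=5)
print(tok.decode(out[0]))

for h in handles:
    h.remove()  # detach hooks to restore the unmodified model
```

Removing the hooks afterwards matters: forward hooks persist on the module, so a forgotten handle would silently alter every later forward pass.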