Can LLMs Reason with Rules? Logic Scaffolding for Stress-Testing and Improving LLMs

Siyuan Wang, Zhongyu Wei, Yejin Choi, Xiang Ren


Abstract
Large language models (LLMs) have achieved impressive human-like performance across various reasoning tasks. However, their mastery of underlying inferential rules still falls short of human capabilities. To investigate this, we propose a logic scaffolding inferential rule generation framework, to construct an inferential rule base, ULogic, comprising both primitive and compositional rules across five domains. Our analysis of GPT-series models over a rule subset reveals significant gaps in LLMs’ logic understanding compared to human performance, especially in compositional and structural complex rules with certain bias patterns. We further distill these rules into a smaller-scale inference engine for flexible rule generation and enhancing downstream reasoning. Through a multi-judger evaluation, our inference engine proves effective in generating accurate, complex and abstract conclusions and premises, and improve various commonsense reasoning tasks. Overall, our work sheds light on LLMs’ limitations in grasping inferential rule and suggests ways to enhance their logical reasoning abilities .
Anthology ID:
2024.luhme-long.406
Volume:
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
August
Year:
2024
Address:
Bangkok, Thailand
Editors:
Lun-Wei Ku, Andre Martins, Vivek Srikumar
Venue:
ACL
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
7523–7543
Language:
URL:
https://aclanthology.org/2024.luhme-long.406/
DOI:
10.18653/v1/2024.acl-long.406
Bibkey:
Cite (ACL):
Siyuan Wang, Zhongyu Wei, Yejin Choi, and Xiang Ren. 2024. Can LLMs Reason with Rules? Logic Scaffolding for Stress-Testing and Improving LLMs. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 7523–7543, Bangkok, Thailand. Association for Computational Linguistics.
Cite (Informal):
Can LLMs Reason with Rules? Logic Scaffolding for Stress-Testing and Improving LLMs (Wang et al., ACL 2024)
Copy Citation:
PDF:
https://aclanthology.org/2024.acl-long.406.pdf