Exploring Self-supervised Logic-enhanced Training for Large Language Models

Fangkai Jiao, Zhiyang Teng, Bosheng Ding, Zhengyuan Liu, Nancy Chen, Shafiq Joty


Abstract
Traditional attempts to enhance the logical reasoning abilities of language models often rely on supervised fine-tuning, limiting their generalization to new tasks or domains. Large Language Models (LLMs), with their capacity to condense vast knowledge, can effectively tackle many tasks. Yet, our experiments reveal a gap in their performance on logical reasoning benchmarks when compared to state-of-the-art fine-tuning-based models. To bridge this gap, we present LogicLLM, a first-of-its-kind, fully self-supervised framework for integrating logical reasoning capabilities into LLMs and activating them via in-context learning. We apply LogicLLM to two LLM series, FLAN-T5 and LLaMA, with parameter sizes ranging from 3 billion to 33 billion. LogicLLM demonstrates its effectiveness through improvements on two logical reasoning benchmarks (ReClor and LogiQA-v2). Additionally, LogicLLM based on FLAN-T5-11B attains results comparable to ChatGPT, and evaluations with LLaMA-based models on three language understanding benchmarks (RACE, MMLU, and Big-Bench-Hard) confirm that the improvements come without compromising the model's general language understanding capabilities.
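The abstract mentions that logical reasoning is activated via in-context learning on benchmarks such as ReClor and LogiQA-v2. As a minimal sketch (not from the paper), the snippet below shows how a ReClor/LogiQA-style multiple-choice item might be packed into a few-shot prompt for such an evaluation; the helper names and example items are invented for illustration, and the resulting prompt would be passed to whichever LLM is being evaluated.

```python
# Illustrative sketch only: building an in-context-learning prompt for a
# multiple-choice logical reasoning item. The item texts are made up.
from typing import List, Optional


def format_item(context: str, question: str, options: List[str],
                answer: Optional[str] = None) -> str:
    """Render one item; include the answer only for demonstration examples."""
    lines = [f"Passage: {context}", f"Question: {question}"]
    for label, option in zip("ABCD", options):
        lines.append(f"{label}. {option}")
    # Leave the answer slot open for the test item so the LLM completes it.
    lines.append(f"Answer: {answer}" if answer is not None else "Answer:")
    return "\n".join(lines)


def build_icl_prompt(demos: List[dict], test_item: dict) -> str:
    """Concatenate answered demonstrations followed by the unanswered test item."""
    blocks = [format_item(d["context"], d["question"], d["options"], d["answer"])
              for d in demos]
    blocks.append(format_item(test_item["context"], test_item["question"],
                              test_item["options"]))
    return "\n\n".join(blocks)


if __name__ == "__main__":
    demo = {
        "context": "All managers attended the meeting. Kim did not attend the meeting.",
        "question": "Which conclusion follows logically?",
        "options": ["Kim is a manager.", "Kim is not a manager.",
                    "Some managers missed the meeting.", "The meeting was cancelled."],
        "answer": "B",
    }
    test = {
        "context": "Every bird in the park can fly. Pip is a bird in the park.",
        "question": "What must be true?",
        "options": ["Pip cannot fly.", "Pip can fly.",
                    "Pip is not in the park.", "Some birds cannot fly."],
    }
    prompt = build_icl_prompt([demo], test)
    print(prompt)  # feed this prompt to the LLM and compare its choice to the gold label
```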
Anthology ID:
2024.naacl-long.53
Volume:
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
Month:
June
Year:
2024
Address:
Mexico City, Mexico
Editors:
Kevin Duh, Helena Gomez, Steven Bethard
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
926–941
URL:
https://aclanthology.org/2024.naacl-long.53
Cite (ACL):
Fangkai Jiao, Zhiyang Teng, Bosheng Ding, Zhengyuan Liu, Nancy Chen, and Shafiq Joty. 2024. Exploring Self-supervised Logic-enhanced Training for Large Language Models. In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), pages 926–941, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal):
Exploring Self-supervised Logic-enhanced Training for Large Language Models (Jiao et al., NAACL 2024)
PDF:
https://aclanthology.org/2024.naacl-long.53.pdf
Copyright:
2024.naacl-long.53.copyright.pdf