SLR: Automated Synthesis for Scalable Logical Reasoning

Lukas Helff; Ahmad Omar; Felix Friedrich; Antonia Wüst; Hikaru Shindo; Rupert Mitchell; Tim Woydt; Patrick Schramowski; Wolfgang Stammer; Kristian Kersting

SLR: Automated Synthesis for Scalable Logical Reasoning

Lukas Helff, Ahmad Omar, Felix Friedrich, Antonia Wüst, Hikaru Shindo, Rupert Mitchell, Tim Woydt, Patrick Schramowski, Wolfgang Stammer, Kristian Kersting

Abstract

We introduce SLR, an end-to-end framework for systematic evaluation and training of Large Language Models (LLMs) via Scalable Logical Reasoning. Given a user’s task specification, SLR automatically synthesizes (i) an instruction prompt for an inductive reasoning task, (ii) a validation program, executable on model outputs to provide verifiable rewards, and (iii) the latent ground-truth rule. This process is fully automated, scalable, requires no human annotations, and offers precise control over task difficulty. Using SLR, we create SLR-Bench, a benchmark comprising 19k prompts organized into 20 curriculum levels that progressively increase in relational, arithmetic, and recursive complexity. Large-scale evaluation reveals that contemporary LLMs readily produce syntactically valid rules, yet often fail at correct logical inference. Recent reasoning LLMs demonstrate improved performance but incur very high test-time computation, with costs exceeding 300 for just 1,000 prompts. Finally, curriculum learning via SLR doubles Llama-3-8B accuracy on SLR-Bench, achieving parity with Gemini-Flash-Thinking at a fraction of computational cost. Moreover, these reasoning capabilities generalize to a wide range of established benchmarks, underscoring the effectiveness of SLR for downstream reasoning.

Anthology ID:: 2026.acl-long.16
Volume:: Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:: July
Year:: 2026
Address:: San Diego, California, United States
Editors:: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:: ACL
SIG:
Publisher:: Association for Computational Linguistics
Note:
Pages:: 402–426
Language:
URL:: https://aclanthology.org/2026.acl-long.16/
DOI:
Bibkey:
Cite (ACL):: Lukas Helff, Ahmad Omar, Felix Friedrich, Antonia Wüst, Hikaru Shindo, Rupert Mitchell, Tim Woydt, Patrick Schramowski, Wolfgang Stammer, and Kristian Kersting. 2026. SLR: Automated Synthesis for Scalable Logical Reasoning. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 402–426, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):: SLR: Automated Synthesis for Scalable Logical Reasoning (Helff et al., ACL 2026)
Copy Citation:
PDF:: https://aclanthology.org/2026.acl-long.16.pdf
Checklist:: 2026.acl-long.16.checklist.pdf

PDF Cite Search Checklist Fix data