@inproceedings{mishra-etal-2025-investigating,
title = "Investigating the Shortcomings of {LLM}s in Step-by-Step Legal Reasoning",
author = "Mishra, Venkatesh and
Pathiraja, Bimsara and
Parmar, Mihir and
Chidananda, Sat and
Srinivasa, Jayanth and
Liu, Gaowen and
Payani, Ali and
Baral, Chitta",
editor = "Chiruzzo, Luis and
Ritter, Alan and
Wang, Lu",
booktitle = "Findings of the Association for Computational Linguistics: NAACL 2025",
month = apr,
year = "2025",
address = "Albuquerque, New Mexico",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2025.findings-naacl.435/",
doi = "10.18653/v1/2025.findings-naacl.435",
pages = "7795--7826",
ISBN = "979-8-89176-195-7",
abstract = "Reasoning abilities of LLMs have been a key focus in recent years. One challenging reasoning domain with interesting nuances is legal reasoning, which requires careful application of rules, and precedents while balancing deductive and analogical reasoning, and conflicts between rules. Although there have been a few works on using LLMs for legal reasoning, their focus has been on overall accuracy. In this paper, we dig deeper to do a step-by-step analysis and figure out where they commit errors. We use the college-level Multiple Choice Question-Answering (MCQA) task from the \textit{Civil Procedure} dataset and propose a new error taxonomy derived from initial manual analysis of reasoning chains with respect to several LLMs, including two objective measures: soundness and correctness scores. We then develop an LLM-based automated evaluation framework to identify reasoning errors and evaluate the performance of LLMs. The computation of soundness and correctness on the dataset using the auto-evaluator framework reveals several interesting insights. Furthermore, we show that incorporating the error taxonomy as feedback in popular prompting techniques marginally increases LLM performance. Our work will also serve as an evaluation framework that can be used in detailed error analysis of reasoning chains for logic-intensive complex tasks."
}

<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="mishra-etal-2025-investigating">
<titleInfo>
<title>Investigating the Shortcomings of LLMs in Step-by-Step Legal Reasoning</title>
</titleInfo>
<name type="personal">
<namePart type="given">Venkatesh</namePart>
<namePart type="family">Mishra</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Bimsara</namePart>
<namePart type="family">Pathiraja</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Mihir</namePart>
<namePart type="family">Parmar</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Sat</namePart>
<namePart type="family">Chidananda</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Jayanth</namePart>
<namePart type="family">Srinivasa</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Gaowen</namePart>
<namePart type="family">Liu</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Ali</namePart>
<namePart type="family">Payani</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Chitta</namePart>
<namePart type="family">Baral</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<originInfo>
<dateIssued>2025-04</dateIssued>
</originInfo>
<typeOfResource>text</typeOfResource>
<relatedItem type="host">
<titleInfo>
<title>Findings of the Association for Computational Linguistics: NAACL 2025</title>
</titleInfo>
<name type="personal">
<namePart type="given">Luis</namePart>
<namePart type="family">Chiruzzo</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Alan</namePart>
<namePart type="family">Ritter</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Lu</namePart>
<namePart type="family">Wang</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<originInfo>
<publisher>Association for Computational Linguistics</publisher>
<place>
<placeTerm type="text">Albuquerque, New Mexico</placeTerm>
</place>
</originInfo>
<genre authority="marcgt">conference publication</genre>
<identifier type="isbn">979-8-89176-195-7</identifier>
</relatedItem>
<abstract>Reasoning abilities of LLMs have been a key focus in recent years. One challenging reasoning domain with interesting nuances is legal reasoning, which requires careful application of rules and precedents while balancing deductive and analogical reasoning and resolving conflicts between rules. Although there have been a few works on using LLMs for legal reasoning, their focus has been on overall accuracy. In this paper, we dig deeper with a step-by-step analysis to identify where they commit errors. We use the college-level Multiple Choice Question-Answering (MCQA) task from the Civil Procedure dataset and propose a new error taxonomy derived from an initial manual analysis of reasoning chains from several LLMs, including two objective measures: soundness and correctness scores. We then develop an LLM-based automated evaluation framework to identify reasoning errors and evaluate the performance of LLMs. Computing soundness and correctness on the dataset with the auto-evaluator framework reveals several interesting insights. Furthermore, we show that incorporating the error taxonomy as feedback in popular prompting techniques marginally increases LLM performance. Our work will also serve as an evaluation framework that can be used for detailed error analysis of reasoning chains in logic-intensive complex tasks.</abstract>
<identifier type="citekey">mishra-etal-2025-investigating</identifier>
<identifier type="doi">10.18653/v1/2025.findings-naacl.435</identifier>
<location>
<url>https://aclanthology.org/2025.findings-naacl.435/</url>
</location>
<part>
<date>2025-04</date>
<extent unit="page">
<start>7795</start>
<end>7826</end>
</extent>
</part>
</mods>
</modsCollection>

%0 Conference Proceedings
%T Investigating the Shortcomings of LLMs in Step-by-Step Legal Reasoning
%A Mishra, Venkatesh
%A Pathiraja, Bimsara
%A Parmar, Mihir
%A Chidananda, Sat
%A Srinivasa, Jayanth
%A Liu, Gaowen
%A Payani, Ali
%A Baral, Chitta
%Y Chiruzzo, Luis
%Y Ritter, Alan
%Y Wang, Lu
%S Findings of the Association for Computational Linguistics: NAACL 2025
%D 2025
%8 April
%I Association for Computational Linguistics
%C Albuquerque, New Mexico
%@ 979-8-89176-195-7
%F mishra-etal-2025-investigating
%X Reasoning abilities of LLMs have been a key focus in recent years. One challenging reasoning domain with interesting nuances is legal reasoning, which requires careful application of rules and precedents while balancing deductive and analogical reasoning and resolving conflicts between rules. Although there have been a few works on using LLMs for legal reasoning, their focus has been on overall accuracy. In this paper, we dig deeper with a step-by-step analysis to identify where they commit errors. We use the college-level Multiple Choice Question-Answering (MCQA) task from the Civil Procedure dataset and propose a new error taxonomy derived from an initial manual analysis of reasoning chains from several LLMs, including two objective measures: soundness and correctness scores. We then develop an LLM-based automated evaluation framework to identify reasoning errors and evaluate the performance of LLMs. Computing soundness and correctness on the dataset with the auto-evaluator framework reveals several interesting insights. Furthermore, we show that incorporating the error taxonomy as feedback in popular prompting techniques marginally increases LLM performance. Our work will also serve as an evaluation framework that can be used for detailed error analysis of reasoning chains in logic-intensive complex tasks.
%R 10.18653/v1/2025.findings-naacl.435
%U https://aclanthology.org/2025.findings-naacl.435/
%U https://doi.org/10.18653/v1/2025.findings-naacl.435
%P 7795-7826
Markdown (Informal)

[Investigating the Shortcomings of LLMs in Step-by-Step Legal Reasoning](https://aclanthology.org/2025.findings-naacl.435/) (Mishra et al., Findings 2025)

ACL

Venkatesh Mishra, Bimsara Pathiraja, Mihir Parmar, Sat Chidananda, Jayanth Srinivasa, Gaowen Liu, Ali Payani, and Chitta Baral. 2025. Investigating the Shortcomings of LLMs in Step-by-Step Legal Reasoning. In Findings of the Association for Computational Linguistics: NAACL 2025, pages 7795–7826, Albuquerque, New Mexico. Association for Computational Linguistics.