Cross Examine: An Ensemble-based approach to leverage Large Language Models for Legal Text Analytics

Saurav Chowdhury; Suyog Joshi; Lipika Dey

doi:10.18653/v1/2024.nllp-1.16

Abstract

Legal documents are complex in nature, describing a course of argumentative reasoning that is followed to settle a case. Churning through large volumes of legal documents is a daily requirement for a large number of professionals who need access to the information embedded in them. Natural language processing methods that help in document summarization with key information components, insight extraction and question answering play a crucial role in legal text processing. Most existing document analysis systems use supervised machine learning, which requires large volumes of annotated training data for each application, making them expensive to build. In this paper we propose a legal text analytics pipeline using Large Language Models (LLMs), which can work with little or no training data. For document summarization, we propose an iterative pipeline using retrieval-augmented generation to ensure that the generated text remains contextually relevant. For question answering, we propose a novel ontology-driven ensemble approach similar to cross-examination that exploits questioning and verification principles. A knowledge graph, created with the extracted information, stores the key entities and relationships reflecting the repository content structure. A new dataset is created with Indian court documents related to bail applications for cases filed under the Protection of Children from Sexual Offences (POCSO) Act, 2012, an Indian law to protect children from sexual abuse and offences. Analysis of insights extracted from the answers reveals patterns of crime and the social conditions leading to those crimes, which are important inputs for social scientists as well as the legal system.

Anthology ID: 2024.nllp-1.16
Volume: Proceedings of the Natural Legal Language Processing Workshop 2024
Month: November
Year: 2024
Address: Miami, FL, USA
Editors: Nikolaos Aletras, Ilias Chalkidis, Leslie Barrett, Cătălina Goanță, Daniel Preoțiuc-Pietro, Gerasimos Spanakis
Venues: NLLP | WS
Publisher: Association for Computational Linguistics
Pages: 194–204
URL: https://aclanthology.org/2024.nllp-1.16/
DOI: 10.18653/v1/2024.nllp-1.16
Cite (ACL): Saurav Chowdhury, Suyog Joshi, and Lipika Dey. 2024. Cross Examine: An Ensemble-based approach to leverage Large Language Models for Legal Text Analytics. In Proceedings of the Natural Legal Language Processing Workshop 2024, pages 194–204, Miami, FL, USA. Association for Computational Linguistics.
Cite (Informal): Cross Examine: An Ensemble-based approach to leverage Large Language Models for Legal Text Analytics (Chowdhury et al., NLLP 2024)
PDF: https://aclanthology.org/2024.nllp-1.16.pdf
