Discovery of Legal Patterns in Civil Petitions via LLM-Based Fact Extraction and Density Clustering

Rhedson Esashika; Carlos M. S. Figueiredo; Tiago de Melo

Abstract

The analysis of unstructured civil petitions is often hindered by procedural noise and verbose argumentation. To address this, we propose a pipeline that combines LLM-based fact extraction with legal-domain text embeddings for unsupervised density clustering. We employ Large Language Models to isolate factual narratives from raw texts, which are then encoded using domain-specific representations (Legal-BERT) and grouped via UMAP dimensionality reduction and the HDBSCAN algorithm. Comparative experiments on a Brazilian judicial corpus reveal that clustering based solely on extracted facts yields significantly more cohesive and semantically well-defined groups than clustering on the raw petition texts, which suffer from fragmentation due to content variability. Results indicate that the proposed method is a promising approach for thematic organization, procedural triage support, and large-scale discovery of legal patterns.

Anthology ID: 2026.propor-1.41
Volume: Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 1
Month: April
Year: 2026
Address: Salvador, Brazil
Editors: Marlo Souza, Iria de-Dios-Flores, Diana Santos, Larissa Freitas, Jackson Wilke da Cruz Souza, Eugénio Ribeiro
Venue: PROPOR
Publisher: Association for Computational Linguistics
Pages: 416–424
URL: https://aclanthology.org/2026.propor-1.41/
Cite (ACL): Rhedson Esashika, Carlos M. S. Figueiredo, and Tiago de Melo. 2026. Discovery of Legal Patterns in Civil Petitions via LLM-Based Fact Extraction and Density Clustering. In Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 1, pages 416–424, Salvador, Brazil. Association for Computational Linguistics.
Cite (Informal): Discovery of Legal Patterns in Civil Petitions via LLM-Based Fact Extraction and Density Clustering (Esashika et al., PROPOR 2026)
PDF: https://aclanthology.org/2026.propor-1.41.pdf
