Caio Hebert

2026

Large Language Models (LLMs) are effective text generators but fabricate legal citations at non-trivial rates, a failure mode with serious consequences in legal practice. In Brazilian Portuguese the risk is amplified by the variability of citation forms in legal writing (juridiquês), by fragment-level references (article → paragraph → item), and by the need to distinguish jurisdictions and court instances.

We describe a production Retrieval-Augmented Generation (RAG) system deployed at a Brazilian legal-technology platform. The system combines (1) domain-tuned hybrid retrieval (lexical, dense, and cross-encoder reranking) over a large-scale legal corpus; (2) grounded generation with explicit citation constraints; and (3) a post-generation Reference Audit layer that extracts legislation and jurisprudence mentions via specialized taggers, normalizes them to a canonical schema, checks existence against authoritative databases at fragment granularity, verifies fidelity against official texts, and triggers targeted rewrites when inconsistencies are detected.

We report production telemetry from 184,895 audited answers containing 43,175 extracted legal references. Legislation references resolve at 81.7%, while jurisprudence references resolve at only 47.1%, which identifies case-law normalization as the primary bottleneck for practitioners. Fidelity verification corrected 6.5% of checked answers before delivery, preventing misrepresented legal claims from reaching end users. By converting silent hallucinations into explicit warnings with per-reference status, the system lets legal professionals trust verified citations and efficiently review flagged ones, rather than manually checking every authority.
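To make the Reference Audit steps more concrete, the following is a minimal sketch of the extract, normalize, and existence-check stages for legislation references only. It is an illustration under stated assumptions, not the deployed implementation: the regular expression stands in for the specialized taggers, `LegislationRef` is a hypothetical canonical schema, and `KNOWN_FRAGMENTS` is a toy in-memory substitute for the authoritative databases. Fidelity verification and targeted rewrites are not covered.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple


@dataclass(frozen=True)
class LegislationRef:
    """Hypothetical canonical schema: law -> article -> paragraph -> item."""
    law: str                         # e.g. "8078/1990"
    article: str                     # e.g. "6"
    paragraph: Optional[str] = None  # e.g. "3" or "único"
    item: Optional[str] = None       # inciso, as a Roman numeral


# Toy tagger (assumption): handles only forms such as
# "art. 6º, inciso VIII, da Lei nº 8.078/1990".
PATTERN = re.compile(
    r"\bart(?:igo|\.)?\s*(?P<article>\d+)\s*[ºo°]?"
    r"(?:,?\s*(?:§|par[áa]grafo)\s*(?P<paragraph>\d+|[úu]nico)\s*[ºo°]?)?"
    r"(?:,?\s*(?:inciso|inc\.)\s*(?P<item>[IVXLCDM]+))?"
    r",?\s*d[ao]\s+Lei\s+(?:n[ºo°.]*\s*)?"
    r"(?P<number>\d[\d.]*)\s*/\s*(?P<year>\d{4})",
    re.IGNORECASE,
)

# Toy stand-in for an authoritative fragment-level database (assumption).
KNOWN_FRAGMENTS: Set[LegislationRef] = {
    LegislationRef("8078/1990", "6", None, "VIII"),
}


def extract_refs(answer: str) -> List[LegislationRef]:
    """Extract legislation mentions and normalize them to the schema."""
    refs = []
    for m in PATTERN.finditer(answer):
        paragraph = m.group("paragraph")
        item = m.group("item")
        refs.append(
            LegislationRef(
                # Drop thousands separators: "8.078" -> "8078".
                law=f"{m.group('number').replace('.', '')}/{m.group('year')}",
                article=m.group("article"),
                paragraph=paragraph.lower() if paragraph else None,
                item=item.upper() if item else None,
            )
        )
    return refs


def audit(answer: str, known: Set[LegislationRef]) -> List[Tuple[LegislationRef, str]]:
    """Attach a per-reference status based on an exact fragment lookup."""
    return [
        (ref, "verified" if ref in known else "unresolved")
        for ref in extract_refs(answer)
    ]


answer = (
    "Conforme o art. 6º, inciso VIII, da Lei nº 8.078/1990 "
    "e o art. 999, § 3º, da Lei 8.078/1990."
)
for ref, status in audit(answer, KNOWN_FRAGMENTS):
    print(status, ref)
```

In this sketch a reference counts as verified only when the whole fragment (law, article, paragraph, item) matches an entry, which mirrors the fragment-granularity check described above. A reference to a real law with a nonexistent article is therefore reported as unresolved rather than silently accepted.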

Co-authors

Caio Nery

Venues

PROPOR
