Grounded in Law: A Multi-Stage Anti-Hallucination Pipeline for Legal RAG Systems in Brazilian Portuguese

Arla Figueiredo, João Lucas, Tatiana Ribeiro, Caio Nery, Alan Rios, Caio Hebert, Luiza Florentino, Arthur Silva, Ícaro Feyerabend, Pedro Vidal, Bruno Cabral


Abstract
Large Language Models (LLMs) are effective text generators but fabricate legal citations at non-trivial rates, a failure mode with serious consequences in legal practice. In Brazilian Portuguese the risk is amplified by citation variability (juridiquês), fragment-level references (article → paragraph → item), and the need to distinguish jurisdictions and court instances.

We describe a production Retrieval-Augmented Generation (RAG) system deployed at a Brazilian legal-technology platform. The system combines (1) domain-tuned hybrid retrieval (lexical, dense, and cross-encoder reranking) over a large-scale legal corpus; (2) grounded generation with explicit citation constraints; and (3) a post-generation Reference Audit layer that extracts legislation and jurisprudence mentions via specialized taggers, normalizes them to a canonical schema, checks existence against authoritative databases at fragment granularity, verifies fidelity against official texts, and triggers targeted rewrites when inconsistencies are detected.

We report production telemetry from 184,895 audited answers containing 43,175 extracted legal references. Legislation references resolve at 81.7%, while jurisprudence references resolve at only 47.1%, identifying case-law normalization as the primary bottleneck for practitioners. Fidelity verification corrected 6.5% of checked answers before delivery, preventing misrepresented legal claims from reaching end users. By converting silent hallucinations into explicit warnings with per-reference status, the system enables legal professionals to trust verified citations and efficiently review flagged ones, rather than manually checking every authority.
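The first three steps of the Reference Audit layer (extract, normalize, check existence at fragment granularity) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and not the authors' implementation: the regex tagger, the `LegalRef` schema, the `KNOWN_FRAGMENTS` toy index standing in for an authoritative database, and the status labels are all hypothetical. Fidelity verification and targeted rewrites are not covered.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical toy index standing in for an authoritative legislation database:
# maps (law number, article) to the paragraph numbers the index knows about.
KNOWN_FRAGMENTS = {
    ("8078", "12"): {"1", "2", "3"},
}


@dataclass(frozen=True)
class LegalRef:
    """Hypothetical canonical schema for a legislation reference."""
    law: str
    article: str
    paragraph: Optional[str]


# Toy tagger (assumption): matches mentions such as
# "art. 12, § 1º, da Lei nº 8.078/1990" or "art. 999 da Lei 8.078".
REF_PATTERN = re.compile(
    r"art\.\s*(?P<article>\d+)[ºo]?"
    r"(?:,\s*§\s*(?P<paragraph>\d+)[ºo]?)?"
    r",?\s*da\s+Lei\s+n?[ºo]?\.?\s*(?P<law>[\d.]+)",
    re.IGNORECASE,
)


def extract_refs(text: str) -> List[LegalRef]:
    """Extract mentions and normalize them (e.g. '8.078' becomes '8078')."""
    return [
        LegalRef(
            law=m.group("law").replace(".", ""),
            article=m.group("article"),
            paragraph=m.group("paragraph"),
        )
        for m in REF_PATTERN.finditer(text)
    ]


def check_existence(ref: LegalRef) -> str:
    """Resolve a reference at fragment level: the article, then the paragraph."""
    paragraphs = KNOWN_FRAGMENTS.get((ref.law, ref.article))
    if paragraphs is None:
        return "unresolved"  # law/article pair not found in the index
    if ref.paragraph is not None and ref.paragraph not in paragraphs:
        return "unresolved"  # article found, but the cited paragraph is not
    return "verified"


def audit(answer: str) -> List[Tuple[LegalRef, str]]:
    """Return a per-reference status for every mention found in the answer."""
    return [(ref, check_existence(ref)) for ref in extract_refs(answer)]


answer = (
    "Conforme o art. 12, § 1º, da Lei nº 8.078/1990 e o art. 999 da Lei 8.078, "
    "o fornecedor responde."
)
for ref, status in audit(answer):
    print(ref, status)
```

In this sketch, a reference that fails the lookup is surfaced with an explicit `unresolved` status instead of being silently dropped, which mirrors the per-reference warnings the abstract describes.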
Anthology ID:
2026.propor-2.9
Volume:
Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 2
Month:
April
Year:
2026
Address:
Salvador, Brazil
Editors:
Marlo Souza, Iria de-Dios-Flores, Diana Santos, Larissa Freitas, Jackson Wilke da Cruz Souza, Eugénio Ribeiro
Venue:
PROPOR
Publisher:
Association for Computational Linguistics
Pages:
30–34
URL:
https://aclanthology.org/2026.propor-2.9/
Cite (ACL):
Arla Figueiredo, João Lucas, Tatiana Ribeiro, Caio Nery, Alan Rios, Caio Hebert, Luiza Florentino, Arthur Silva, Ícaro Feyerabend, Pedro Vidal, and Bruno Cabral. 2026. Grounded in Law: A Multi-Stage Anti-Hallucination Pipeline for Legal RAG Systems in Brazilian Portuguese. In Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 2, pages 30–34, Salvador, Brazil. Association for Computational Linguistics.
Cite (Informal):
Grounded in Law: A Multi-Stage Anti-Hallucination Pipeline for Legal RAG Systems in Brazilian Portuguese (Figueiredo et al., PROPOR 2026)
PDF:
https://aclanthology.org/2026.propor-2.9.pdf