Development and Evaluation of a Hybrid Information Retrieval System Applied to the Brazilian Legal Domain

Ana Carolina C. Bessa, Fábio M. F. Lobato, Antonio F. L. J. Junior


Abstract
The need for tools that assist in process management by automating tasks and reducing delays in the judicial system justifies improving traditional Information Retrieval systems, which are often limited by vocabulary mismatch and the length of legal texts. Although Transformer-based models capture semantic nuances, their input size constraints make it difficult to process long texts without losing information. In this work, we propose a hybrid system for the legal domain that combines the BM25L algorithm with the BumbaLM language model.
Anthology ID:
2026.propor-2.26
Volume:
Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 2
Month:
April
Year:
2026
Address:
Salvador, Brazil
Editors:
Marlo Souza, Iria de-Dios-Flores, Diana Santos, Larissa Freitas, Jackson Wilke da Cruz Souza, Eugénio Ribeiro
Venue:
PROPOR
Publisher:
Association for Computational Linguistics
Pages:
186–190
URL:
https://aclanthology.org/2026.propor-2.26/
Cite (ACL):
Ana Carolina C. Bessa, Fábio M. F. Lobato, and Antonio F. L. J. Junior. 2026. Development and Evaluation of a Hybrid Information Retrieval System Applied to the Brazilian Legal Domain. In Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 2, pages 186–190, Salvador, Brazil. Association for Computational Linguistics.
Cite (Informal):
Development and Evaluation of a Hybrid Information Retrieval System Applied to the Brazilian Legal Domain (Bessa et al., PROPOR 2026)
PDF:
https://aclanthology.org/2026.propor-2.26.pdf