Ana Carolina C. Bessa
2026
Development and Evaluation of a Hybrid Information Retrieval System Applied to the Brazilian Legal Domain
Ana Carolina C. Bessa | Fábio M. F. Lobato | Antonio F. L. J. Junior
Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 2
Tools that assist in process management by automating tasks and reducing delays in the judicial system are needed, and this need justifies improving traditional Information Retrieval systems, which are often limited by vocabulary mismatch and by the length of legal texts. Although Transformer-based models capture semantic nuances, their input-size constraints make it difficult to process long texts without losing information. In this work, we propose a hybrid system for the legal domain that combines the BM25L algorithm with the BumbaLM language model.
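To make the idea of a hybrid lexical-plus-semantic retriever concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the system described in the paper: the abstract does not say how the BM25L and BumbaLM scores are combined, so the min-max normalization, the linear fusion weight `alpha`, and the `hybrid_rank` and `toy_semantic` helpers are all hypothetical. The BM25L function follows the commonly published form of the formula with default-style parameters (`k1`, `b`, `delta`) chosen for illustration. The semantic scorer is passed in as a callable, and the toy one used here is a simple token-overlap stand-in, not a language model.

```python
import math
from collections import Counter


def bm25l_scores(query, docs, k1=1.5, b=0.75, delta=0.5):
    """Score each tokenized document against a tokenized query with BM25L.

    Assumes a non-empty corpus with at least one non-empty document.
    """
    n_docs = len(docs)
    avgdl = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        norm = 1.0 - b + b * len(d) / avgdl  # length normalization
        score = 0.0
        for term in set(query):
            if tf[term] == 0:
                continue  # only terms present in the document contribute
            idf = math.log((n_docs + 1) / (df[term] + 0.5))
            c = tf[term] / norm  # length-normalized term frequency
            # The delta shift keeps long documents from being over-penalized.
            score += idf * (k1 + 1) * (c + delta) / (k1 + c + delta)
        scores.append(score)
    return scores


def min_max(values):
    """Rescale scores to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def hybrid_rank(query, docs, semantic_score, alpha=0.5):
    """Hypothetical fusion: alpha * lexical + (1 - alpha) * semantic."""
    lexical = min_max(bm25l_scores(query, docs))
    semantic = min_max([semantic_score(query, d) for d in docs])
    fused = [alpha * l + (1 - alpha) * s for l, s in zip(lexical, semantic)]
    return sorted(range(len(docs)), key=lambda i: fused[i], reverse=True)


def toy_semantic(query, doc):
    """Stand-in scorer (Jaccard overlap); a real system would use a model."""
    q, d = set(query), set(doc)
    return len(q & d) / len(q | d) if q | d else 0.0


docs = [
    "recurso de apelação em processo civil".split(),
    "habeas corpus em processo penal".split(),
    "contrato de locação de imóvel".split(),
]
ranking = hybrid_rank("habeas corpus".split(), docs, toy_semantic)
print(ranking)  # document indices, best match first
```

In a real pipeline the `semantic_score` callable would wrap a language model, and long legal texts would typically need chunking or truncation before being passed to it, which is the input-size constraint the abstract points to.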