A RAG Chatbot with Incremental Context Retrieval based on Local LLMs for Hospital Documents

Murilo Vargas da Cunha, Marília Rosa Silveira, César Brasil Sperb, Larissa Astrogildo Freitas, Ulisses Brisolara Corrêa


Abstract
Adopting large language models (LLMs) in hospital environments demands solutions that ensure information security, computational efficiency, and rigorous control over sensitive institutional data. This work presents the development and evaluation of a chatbot based on Retrieval-Augmented Generation (RAG) that uses exclusively local LLMs and is applied to the internal Portuguese-language documents of a university hospital, namely Standard Operating Procedures and technical manuals. The methodology first evaluates the quality of information retrieval with dense embedding models, measured by Mean Reciprocal Rank (MRR). The generation stage is then analyzed in two scenarios: (i) RAG with fixed context, in which multiple chunks are provided to the model simultaneously, and (ii) incremental page retrieval, in which chunks are sent sequentially in retrieval-rank order. Generation was assessed with four local LLMs (MedGemma3:27B, Gemma3:27B, Gpt-oss:20B, and Mistral Small 3.1), using BERTScore as the quality metric. The results indicate that indiscriminately enlarging the context in the fixed-context scenario degrades generation quality, even though it increases the probability of retrieving the relevant chunk. In contrast, incremental page retrieval improved BERTScore values, with MedGemma3:27B achieving the best overall results. These findings indicate that adaptive context control is a critical factor in increasing the reliability and efficiency of RAG systems based on local LLMs in the healthcare domain.
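The two ideas named in the abstract, Mean Reciprocal Rank and sending chunks one at a time in rank order, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the `generate` callable (a stand-in for a call to a local LLM), the `max_chunks` limit, and the stopping rule (stop at the first chunk for which `generate` returns an answer) are all assumptions made for illustration; the abstract does not say how the system decides when to stop.

```python
from typing import Callable, Optional, Sequence, Tuple


def mean_reciprocal_rank(
    rankings: Sequence[Sequence[str]], relevant_ids: Sequence[str]
) -> float:
    """Average over queries of 1/rank of the relevant chunk (0 if it is absent)."""
    if len(rankings) != len(relevant_ids):
        raise ValueError("one relevant id is required per ranking")
    if not rankings:
        return 0.0
    total = 0.0
    for ranking, relevant in zip(rankings, relevant_ids):
        for position, chunk_id in enumerate(ranking, start=1):
            if chunk_id == relevant:
                total += 1.0 / position
                break
    return total / len(rankings)


def answer_incrementally(
    question: str,
    ranked_chunks: Sequence[str],
    generate: Callable[[str, str], Optional[str]],
    max_chunks: int = 5,
) -> Tuple[Optional[str], int]:
    """Offer chunks one at a time in rank order.

    Assumed stopping rule: return as soon as `generate` yields a
    non-None answer. `generate` is a hypothetical stand-in for a
    local LLM call that returns None when the chunk is insufficient.
    """
    tried = 0
    for chunk in ranked_chunks[:max_chunks]:
        tried += 1
        answer = generate(question, chunk)
        if answer is not None:
            return answer, tried
    return None, tried


def toy_generate(question: str, chunk: str) -> Optional[str]:
    """Toy stand-in for an LLM: 'answers' only if the chunk mentions 'dose'."""
    return chunk if "dose" in chunk else None


# Relevant chunk at rank 1, rank 2, and missing: (1 + 1/2 + 0) / 3 = 0.5
mrr = mean_reciprocal_rank(
    [["a", "b", "c"], ["x", "y", "z"], ["p", "q"]], ["a", "y", "missing"]
)
result = answer_incrementally(
    "What is the dose?", ["intro", "the dose is 5 mg", "appendix"], toy_generate
)
```

In this toy run the second-ranked chunk is the first to yield an answer, so the model sees two chunks one at a time rather than all three at once, which is the contrast with the fixed-context scenario that the abstract draws.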
Anthology ID:
2026.propor-2.14
Volume:
Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 2
Month:
April
Year:
2026
Address:
Salvador, Brazil
Editors:
Marlo Souza, Iria de-Dios-Flores, Diana Santos, Larissa Freitas, Jackson Wilke da Cruz Souza, Eugénio Ribeiro
Venue:
PROPOR
Publisher:
Association for Computational Linguistics
Pages:
68–77
URL:
https://aclanthology.org/2026.propor-2.14/
Cite (ACL):
Murilo Vargas da Cunha, Marília Rosa Silveira, César Brasil Sperb, Larissa Astrogildo Freitas, and Ulisses Brisolara Corrêa. 2026. A RAG Chatbot with Incremental Context Retrieval based on Local LLMs for Hospital Documents. In Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 2, pages 68–77, Salvador, Brazil. Association for Computational Linguistics.
Cite (Informal):
A RAG Chatbot with Incremental Context Retrieval based on Local LLMs for Hospital Documents (Cunha et al., PROPOR 2026)
PDF:
https://aclanthology.org/2026.propor-2.14.pdf