Brenda Salenave Santana

2026

Avaliação End-to-End de um Sistema RAG para Documentos Hospitalares em Português [End-to-End Evaluation of a RAG System for Hospital Documents in Portuguese]
Murilo Vargas da Cunha | Marília Rosa Silveira | César Brasil Sperb | Brenda Salenave Santana | Larissa Astrogildo Freitas | Ulisses Brisolara Corrêa
Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 1

This paper evaluates an end-to-end Retrieval-Augmented Generation (RAG) system for querying regulatory hospital documents in Portuguese. The study analyzes the impact of optimizing each component (retrieval, reranking, and generation) in a resource-constrained setting. The methodology combined the creation of a hybrid dataset (synthetic and expert-validated) with quantitative evaluations using metrics such as MRR, NDCG@10, and BERTScore. The results show that the embedding model intfloat/multilingual-e5-small was the most robust, with a retrieval failure rate of only 1.4%. In the reranking stage, the RRF method stood out for its balance between computational cost and performance. We conclude that the optimized architecture, which integrates these components with the Gemini 2.5 Flash generator, offers an efficient and accurate solution for decision support in hospital environments.


Modeling Linguistic Violence: An Ontology-Based Framework for the Computational Analysis of Violence Manifested in Language
Brenda Salenave Santana | Ana Marilza Pernas | Aline A. Vanin
Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 1

The conceptual ambiguity among terms like ‘hate speech’, ‘toxic speech’, and ‘dangerous speech’ creates a significant bottleneck for both research and automated moderation. Traditional NLP models, often focused on lexical cues, struggle to differentiate these nuanced forms of linguistic violence, especially when the harm is implicit. This paper addresses this gap with a twofold objective. First, we conduct a conceptual review and propose a unified ontology that differentiates these concepts, including verbal aggression and cyberbullying, based on their core attributes, such as their target, intent, and associated rhetorical hallmarks. Second, we propose a computational methodology designed to operationalize this ontology. Our framework uses a multi-stage NLP pipeline that leverages semantic analysis, specifically Semantic Role Labeling and Named Entity Recognition, to deconstruct speech acts into their core components (e.g., target and action). This component-based approach allows for a granular classification that can robustly distinguish between seemingly similar phenomena, such as a general insult and a targeted identity-based attack. This methodology is particularly promising for low-resource languages, such as Portuguese, as it relies on core semantic tasks for which multilingual models are available, rather than requiring massive, task-specific labeled datasets.

Co-authors

César Brasil Sperb 1

Aline A. Vanin 1

Venues

PROPOR 2
