Felipe S. F. Paula
2026
The PROPOR Ecosystem: Structure, Roles, and Evolution of Portuguese-Language NLP
Rafael O. Nunes | Gustavo L. Tamiosso | Pedro L. C. de Andrade | Matheus S. de Aguiar | Rafael P. de Gouveia | Higor Moreira | Bruno Tavares | Laura P. de Gouveia | Felipe S. F. Paula | Andre Spritzer | Hidelberg O. Albuquerque | Nádia F. F. da Silva | Ellen P. R. S. Pereira | Dennis G. Balreira | Joel L. Carbonera
Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 1
The PROPOR conference has been the main venue for Portuguese-language Natural Language Processing (NLP) research for over two decades. This paper presents a longitudinal bibliometric analysis of PROPOR from 2003 to 2024, examining thematic evolution, community structure, and scientific impact. We identify a shift from speech-oriented research toward text-based tasks, alongside the sustained importance of resources and linguistic theory. The community exhibits a stable structure, with complementary leadership models centered on institutional hubs and brokerage roles. Scientific impact is highly concentrated and follows a long-tail distribution; the analysis distinguishes cumulative, productivity-driven impact from the rapidly accelerating citation uptake of recent editions. These findings characterize PROPOR as a resilient regional linguistic ecosystem evolving in dialogue with broader NLP paradigms.
Negation-Aware Data Augmentation for Portuguese Natural Language Inference
Maria Cecília M. Corrêa | Felipe S. F. Paula | Matheus Westhelle | Viviane P. Moreira
Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 1
Negation plays a fundamental role in human communication and logical reasoning, yet it remains underrepresented in natural language inference (NLI) datasets. This work investigates the impact of targeted data augmentation using negation cues on the main NLI datasets for Portuguese (InferBR, ASSIN, and ASSIN2). By synthetically generating new instances with negated hypotheses, we create more diverse training and test sets. A BERT-based model was fine-tuned and tested on the combined datasets and the augmented data. The results show that the model was heavily influenced by bias in the use of negation and that increased data diversity improves the model’s handling of negation.