Miguel de Mello Carpi
2026
FlexQwen: Exploring Hybrid Objectives and Text Originality for Portuguese
Miguel de Mello Carpi | Marcelo Finger
Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 1
While scaling laws suggest increasing model and dataset sizes for better results, efficient pre-training techniques for low-resource scenarios present unique challenges that require further investigation. This work introduces FlexQwen, a model based on the Qwen 3 architecture adapted for a hybrid causal-masked objective, and the Carolina Originality dataset, a subset of the Corpus Carolina tailored for efficient pre-training in Portuguese. We investigate two primary research questions: the influence of hybrid masked-causal modelling and the impact of text originality on model performance. Our experiments compare a high-originality Gold split against a length-matched control group. Results indicate that hybrid objectives may be viable for efficient training. Furthermore, we provide open access to our code, datasets, and training logs to foster further research in efficient Portuguese LLMs.
2024
Exploring Computational Discernibility of Discourse Domains in Brazilian Portuguese within the Carolina Corpus
Felipe Ribas Serras | Mariana Sturzeneker | Miguel de Mello Carpi | Mayara Feliciano Palma | Maria Clara Ramos Morales Crespo | Aline Silva Costa | Vanessa Martins Do Monte | Cristiane Namiuti | Maria Clara Paixão de Souza | Marcelo Finger
Proceedings of the 16th International Conference on Computational Processing of Portuguese - Vol. 1