FlexQwen: Exploring Hybrid Objectives and Text Originality for Portuguese

Miguel de Mello Carpi, Marcelo Finger


Abstract
While scaling laws suggest increasing model and dataset sizes for better results, efficient pre-training techniques for low-resource scenarios present unique challenges that require further investigation. This work introduces FlexQwen, a model based on the Qwen 3 architecture adapted for a hybrid causal-masked objective, and the Carolina Originality dataset, a subset of the Corpus Carolina tailored for efficient pre-training in Portuguese. We investigate two primary research questions: the influence of hybrid causal-masked modelling and the impact of text originality on model performance. Our experiments compare a high-originality Gold split against a length-matched control group. Results indicate that hybrid objectives may be viable for efficient training. We also provide open access to our code, datasets, and training logs to foster further research in efficient Portuguese LLMs.
Anthology ID:
2026.propor-1.114
Volume:
Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 1
Month:
April
Year:
2026
Address:
Salvador, Brazil
Editors:
Marlo Souza, Iria de-Dios-Flores, Diana Santos, Larissa Freitas, Jackson Wilke da Cruz Souza, Eugénio Ribeiro
Venue:
PROPOR
Publisher:
Association for Computational Linguistics
Pages:
1079–1084
URL:
https://aclanthology.org/2026.propor-1.114/
Cite (ACL):
Miguel de Mello Carpi and Marcelo Finger. 2026. FlexQwen: Exploring Hybrid Objectives and Text Originality for Portuguese. In Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 1, pages 1079–1084, Salvador, Brazil. Association for Computational Linguistics.
Cite (Informal):
FlexQwen: Exploring Hybrid Objectives and Text Originality for Portuguese (Carpi & Finger, PROPOR 2026)
PDF:
https://aclanthology.org/2026.propor-1.114.pdf