Beatriz Canaverde
2026
AMALIA: A Fully Open Large Language Model for European Portuguese
Afonso Simplício | Gonçalo Vinagre | Miguel Moura Ramos | Diogo Tavares | Rafael Ferreira | Giuseppe Attanasio | Duarte M. Alves | Inês Calvo | Inês Vieira | Rui Guerra | James Furtado | Beatriz Canaverde | Iago Paulo | Vasco Ramos | Diogo Glória-Silva | Miguel Faria | Marcos Treviso | Daniel Gomes | Pedro Gomes | David Semedo | André Martins | João Magalhães
Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 1
Despite rapid progress in open large language models (LLMs), European Portuguese (pt-PT) remains underrepresented in both training data and native evaluation, with machine-translated benchmarks likely missing the variant’s linguistic and cultural nuances. We introduce AMALIA, a fully open LLM that prioritizes pt-PT by using more high-quality pt-PT data during both the mid- and post-training stages. To evaluate pt-PT more faithfully, we release a suite of pt-PT benchmarks that includes translated standard tasks and four new datasets targeting pt-PT generation, linguistic competence, and pt-PT/pt-BR bias. Experiments show that AMALIA matches strong baselines on translated benchmarks while substantially improving performance on pt-PT-specific evaluations, supporting the case for targeted training and native benchmarking for European Portuguese.
MATH-PT: A Math Reasoning Benchmark for European and Brazilian Portuguese
Tiago Teixeira | Ana Carolina Erthal | Juan Belieni | Beatriz Canaverde | Diego Mesquita | Miguel Faria | Eliezer de Souza da Silva | André F. T. Martins
Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 1
The use of large language models (LLMs) for complex mathematical reasoning is an emergent area of research, with fast progress in methods, models, and benchmark datasets. However, most mathematical reasoning evaluations exhibit a significant linguistic bias, with the vast majority of benchmark datasets being exclusively in English or (at best) translated from English. We address this limitation by introducing MATH-PT, a novel dataset comprising 1,729 mathematical problems written in European and Brazilian Portuguese. MATH-PT is curated from a variety of high-quality native sources, including mathematical Olympiads, competitions, and exams from Portugal and Brazil. We present a comprehensive benchmark of current state-of-the-art LLMs on MATH-PT, revealing that frontier reasoning models achieve strong performance on multiple-choice questions compared to open-weight models, but that their performance decreases on questions with figures and on open-ended questions. To facilitate future research, we release the benchmark dataset and model outputs.
Co-authors
- Miguel Faria 2
- André F. T. Martins 2
- Duarte M. Alves 1
- Giuseppe Attanasio 1
- Juan Belieni 1
- Inês Calvo 1
- Ana Carolina Erthal 1
- Rafael Ferreira 1
- James Furtado 1
- Diogo Glória-Silva 1
- Daniel Gomes 1
- Pedro Gomes 1
- Rui Guerra 1
- João Magalhães 1
- Diego Mesquita 1
- Iago Paulo 1
- Miguel Moura Ramos 1
- Vasco Ramos 1
- David Semedo 1
- Eliezer de Souza da Silva 1
- Afonso Simplício 1
- Diogo Tavares 1
- Tiago Teixeira 1
- Marcos Treviso 1
- Inês Vieira 1
- Gonçalo Vinagre 1