Vasco Ramos
2026
AMALIA: A Fully Open Large Language Model for European Portuguese
Afonso Simplício | Gonçalo Vinagre | Miguel Moura Ramos | Diogo Tavares | Rafael Ferreira | Giuseppe Attanasio | Duarte M. Alves | Inês Calvo | Inês Vieira | Rui Guerra | James Furtado | Beatriz Canaverde | Iago Paulo | Vasco Ramos | Diogo Glória-Silva | Miguel Faria | Marcos Treviso | Daniel Gomes | Pedro Gomes | David Semedo | André Martins | João Magalhães
Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 1
Despite rapid progress in open large language models (LLMs), European Portuguese (pt-PT) remains underrepresented in both training data and native evaluation, with machine-translated benchmarks likely missing the variant’s linguistic and cultural nuances. We introduce AMALIA, a fully open LLM that prioritizes pt-PT by using more high-quality pt-PT data during both the mid- and post-training stages. To evaluate pt-PT more faithfully, we release a suite of pt-PT benchmarks that includes translated standard tasks and four new datasets targeting pt-PT generation, linguistic competence, and pt-PT/pt-BR bias. Experiments show that AMALIA matches strong baselines on translated benchmarks while substantially improving performance on pt-PT-specific evaluations, supporting the case for targeted training and native benchmarking for European Portuguese.
2024
Generating Coherent Sequences of Visual Illustrations for Real-World Manual Tasks
João Bordalo | Vasco Ramos | Rodrigo Valério | Diogo Glória-Silva | Yonatan Bitton | Michal Yarom | Idan Szpektor | João Magalhães
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Multistep instructions, such as recipes and how-to guides, greatly benefit from visual aids, such as a series of images that accompany the instruction steps. While Large Language Models (LLMs) have become adept at generating coherent textual steps, Large Vision/Language Models (LVLMs) are less capable of generating accompanying image sequences. The most challenging aspect is that each generated image needs to adhere to the relevant textual step instruction, as well as be visually consistent with earlier images in the sequence. To address this problem, we propose an approach for generating consistent image sequences, which integrates a Latent Diffusion Model (LDM) with an LLM to transform the sequence into a caption to maintain the semantic coherence of the sequence. In addition, to maintain the visual coherence of the image sequence, we introduce a copy mechanism to initialise reverse diffusion processes with a latent vector iteration from a previously generated image from a relevant step. Both strategies condition the reverse diffusion process on the sequence of instruction steps and tie the contents of the current image to previous instruction steps and corresponding images. Experiments show that the proposed approach is preferred by humans in 46.6% of the cases against 26.6% for the second-best method. In addition, automatic metrics show that the proposed method maintains semantic coherence and visual consistency across steps in both domains.
Co-authors
- Diogo Glória-Silva 2
- João Magalhães 2
- Duarte M. Alves 1
- Giuseppe Attanasio 1
- Yonatan Bitton 1
- João Bordalo 1
- Inês Calvo 1
- Beatriz Canaverde 1
- Miguel Faria 1
- Rafael Ferreira 1
- James Furtado 1
- Daniel Gomes 1
- Pedro Gomes 1
- Rui Guerra 1
- André F. T. Martins 1
- Iago Paulo 1
- Miguel Moura Ramos 1
- David Semedo 1
- Afonso Simplício 1
- Idan Szpektor 1
- Diogo Tavares 1
- Marcos Treviso 1
- Rodrigo Valério 1
- Inês Vieira 1
- Gonçalo Vinagre 1
- Michal Yarom 1