Ernesto Luis Estevanell-Valladares

Also published as: Ernesto Luis Estevanell Valladares


2025

XAutoLM: Efficient Fine-Tuning of Language Models via Meta-Learning and AutoML
Ernesto Luis Estevanell Valladares | Suilan Estevez-Velarde | Yoan Gutierrez | Andrés Montoyo | Ruslan Mitkov
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing

Experts in machine learning leverage domain knowledge to navigate decisions in model selection, hyperparameter optimization, and resource allocation. This is particularly critical for fine-tuning language models (LMs), where repeated trials incur substantial computational overhead and environmental impact. However, no existing automated framework simultaneously tackles the entire model selection and hyperparameter optimization (HPO) task for resource-efficient LM fine-tuning. We introduce XAutoLM, a meta-learning-augmented AutoML framework that reuses past experiences to optimize discriminative and generative LM fine-tuning pipelines efficiently. XAutoLM learns from stored successes and failures by extracting task- and system-level meta-features to bias its sampling toward valuable configurations and away from costly dead ends. On four text classification and two question-answering benchmarks, XAutoLM surpasses the zero-shot optimizer's peak F1 on five of six tasks, cuts the mean pipeline evaluation time by up to 4.5x, reduces search error ratios by up to sevenfold, and uncovers up to 50% more pipelines above the zero-shot Pareto front. In contrast, simpler memory-based baselines suffer from negative transfer. We release XAutoLM and our experience store to catalyze resource-efficient, Green AI fine-tuning in the NLP community.
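The meta-feature-biased sampling described in the abstract can be illustrated with a minimal sketch. The experience-store schema, similarity metric, and scoring rule below are hypothetical simplifications for illustration, not XAutoLM's actual implementation:

    import random

    def similarity(mf_a, mf_b):
        # Toy metric: inverse absolute difference over shared meta-features.
        return 1.0 / (1.0 + sum(abs(mf_a[k] - mf_b[k]) for k in mf_a))

    def biased_sample(candidates, experience_store, current_mf, k=5):
        # Weight each candidate configuration by the outcomes of similar past
        # runs: reward successes on similar tasks, penalize failures/dead ends.
        def weight(cfg):
            runs = [r for r in experience_store if r["config"] == cfg]
            if not runs:
                return 1.0  # unexplored configurations keep a neutral prior
            score = sum(similarity(current_mf, r["meta_features"]) *
                        (r["f1"] if r["ok"] else -0.5) for r in runs) / len(runs)
            return score + 1.0
        weights = [max(weight(c), 1e-6) for c in candidates]
        return random.choices(candidates, weights=weights, k=k)

    store = [{"config": "lora-r8", "meta_features": {"n_examples": 900, "avg_len": 110.0},
              "f1": 0.87, "ok": True},
             {"config": "full-ft", "meta_features": {"n_examples": 900, "avg_len": 110.0},
              "f1": 0.0, "ok": False}]
    current = {"n_examples": 1000, "avg_len": 120.0}
    print(biased_sample(["lora-r8", "full-ft", "adapter"], store, current, k=3))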

Proceedings of the First Workshop on Advancing NLP for Low-Resource Languages
Ernesto Luis Estevanell-Valladares | Alicia Picazo-Izquierdo | Tharindu Ranasinghe | Besik Mikaberidze | Simon Ostermann | Daniil Gurgurov | Philipp Mueller | Claudia Borg | Marián Šimko
Proceedings of the First Workshop on Advancing NLP for Low-Resource Languages

Proceedings of the First Workshop on Comparative Performance Evaluation: From Rules to Language Models
Alicia Picazo-Izquierdo | Ernesto Luis Estevanell-Valladares | Ruslan Mitkov | Rafael Muñoz Guillena | Raúl García Cerdá
Proceedings of the First Workshop on Comparative Performance Evaluation: From Rules to Language Models

Detection of AI-generated Content in Scientific Abstracts
Ernesto Luis Estevanell-Valladares | Alicia Picazo-Izquierdo | Ruslan Mitkov
Proceedings of the First Workshop on Comparative Performance Evaluation: From Rules to Language Models

The growing use of generative AI in academic writing raises urgent questions about authorship and the integrity of scientific communication. This study addresses the detection of AI-generated scientific abstracts by constructing a temporally anchored dataset of paired abstracts: for each work published before 2021, the original human-written abstract and a synthetic counterpart generated with GPT-4.1. We evaluate three approaches to authorship classification: zero-shot large language models (LLMs), fine-tuned encoder-based transformers, and traditional machine learning classifiers. Results show that LLMs perform near chance level, while a LoRA-fine-tuned DistilBERT and a Passive-Aggressive classifier achieve near-perfect performance. These findings suggest that shallow lexical or stylistic patterns still differentiate human and AI writing, and that supervised learning is key to capturing these signals.
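As a concrete illustration of the strongest classical baseline named above, the sketch below trains a Passive-Aggressive classifier over TF-IDF features with scikit-learn. The toy abstracts, labels, and hyperparameters are placeholders; the paper's dataset and preprocessing differ:

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import PassiveAggressiveClassifier
    from sklearn.pipeline import make_pipeline

    # Toy stand-ins: 0 = human-written, 1 = AI-generated.
    abstracts = ["We report preliminary findings on ...",
                 "In this study, we comprehensively examine ..."]
    labels = [0, 1]

    clf = make_pipeline(
        TfidfVectorizer(ngram_range=(1, 2), sublinear_tf=True),
        PassiveAggressiveClassifier(max_iter=1000),
    )
    clf.fit(abstracts, labels)
    print(clf.predict(["This paper investigates ..."]))

That a linear model over lexical n-grams can succeed is consistent with the abstract's conclusion that shallow stylistic patterns, captured via supervised learning, still separate human and AI writing.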

Towards Intention-aligned Reviews Summarization: Enhancing LLM Outputs with Pragmatic Cues
Maria Miro Maestre | Robiert Sepulveda-Torres | Ernesto Luis Estevanell-Valladares | Armando Suarez Cueto | Elena Lloret
Proceedings of the 15th International Conference on Recent Advances in Natural Language Processing - Natural Language Processing in the Generative AI Era

Recent advancements in Natural Language Processing (NLP) have allowed systems to address complex tasks involving cultural knowledge, multi-step reasoning, and inference. While significant progress has been made in text summarization guided by specific instructions or stylistic cues, the integration of pragmatic aspects like communicative intentions remains underexplored, particularly in non-English languages. This study emphasizes communicative intentions as central to summary generation, classifying Spanish product reviews by intent and using prompt engineering to produce intention-aligned summaries. Results indicate challenges for large language models (LLMs) in processing extensive document clusters, with summarization accuracy heavily dependent on prior model exposure to similar intentions. Common intentions such as complimenting and criticizing are reliably handled, whereas less frequent ones like promising or questioning pose greater difficulties. These findings suggest that integrating communicative intentions into summarization tasks can significantly enhance summary relevance and clarity, thereby improving user experience in product review analysis.
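The intent-conditioned prompting step can be sketched as follows. The intent labels and prompt template are illustrative assumptions based on the abstract, not the paper's exact taxonomy or prompts:

    INTENTS = ["complimenting", "criticizing", "promising", "questioning"]

    def build_prompt(reviews, intent):
        # Condition the summary on the communicative intention shared by the cluster.
        joined = "\n- ".join(reviews)
        return (f"The following Spanish product reviews share the communicative "
                f"intention of {intent}. Write a brief summary that preserves "
                f"that intention:\n- {joined}")

    prompt = build_prompt(["Me encanta este producto.", "Calidad excelente."],
                          "complimenting")
    # `prompt` would then be sent to the LLM under evaluation.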

2020

Knowledge Discovery in COVID-19 Research Literature
Alejandro Piad-Morffis | Suilan Estevez-Velarde | Ernesto Luis Estevanell-Valladares | Yoan Gutiérrez | Andrés Montoyo | Rafael Muñoz | Yudivián Almeida-Cruz
Proceedings of the 1st Workshop on NLP for COVID-19 (Part 2) at EMNLP 2020

This paper presents the preliminary results of an ongoing project that analyzes the growing body of scientific research published around the COVID-19 pandemic. In this research, a general-purpose semantic model is used to double-annotate a batch of 500 sentences manually selected by the researchers from the CORD-19 corpus. Afterwards, a baseline text-mining pipeline is designed and evaluated on a larger batch of 100,959 sentences. We present a qualitative analysis of the most interesting facts automatically extracted and highlight possible future lines of development. The preliminary results show that general-purpose semantic models are a useful tool for discovering fine-grained knowledge in large corpora of scientific documents.
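A sentence-level extraction pass in the spirit of the baseline pipeline described above can be sketched as follows. The regex-based extractor is a toy stand-in; the paper relies on a general-purpose semantic model rather than surface patterns:

    import re

    # Hypothetical surface pattern for subject-relation-object triples.
    PATTERN = re.compile(r"(?P<subj>[A-Z][\w-]+) (?:causes|inhibits|treats) (?P<obj>[\w-]+)")

    def extract_facts(sentences):
        facts = []
        for sent in sentences:
            for m in PATTERN.finditer(sent):
                relation = m.group(0).split()[1]
                facts.append((m.group("subj"), relation, m.group("obj")))
        return facts

    print(extract_facts(["Remdesivir inhibits replication in laboratory assays."]))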