Robiert Sepúlveda-Torres

Also published as: Robiert Sepulveda Torres


2025

pdf bib
IberoBench: A Benchmark for LLM Evaluation in Iberian Languages
Irene Baucells | Javier Aula-Blasco | Iria de-Dios-Flores | Silvia Paniagua Suárez | Naiara Perez | Anna Salles | Susana Sotelo Docio | Júlia Falcão | Jose Javier Saiz | Robiert Sepulveda Torres | Jeremy Barnes | Pablo Gamallo | Aitor Gonzalez-Agirre | German Rigau | Marta Villegas
Proceedings of the 31st International Conference on Computational Linguistics

The current best practice to measure the performance of base Large Language Models is to establish a multi-task benchmark that covers a range of capabilities of interest. Currently, however, such benchmarks are only available in a few high-resource languages. To address this situation, we present IberoBench, a multilingual, multi-task benchmark for Iberian languages (i.e., Basque, Catalan, Galician, European Spanish and European Portuguese) built on the LM Evaluation Harness framework. The benchmark consists of 62 tasks divided into 179 subtasks. We evaluate 33 existing LLMs on IberoBench on 0- and 5-shot settings. We also explore the issues we encounter when working with the Harness and our approach to solving them to ensure high-quality evaluation.

2019

pdf bib
Team GPLSI. Approach for automated fact checking
Aimée Alonso-Reina | Robiert Sepúlveda-Torres | Estela Saquete | Manuel Palomar
Proceedings of the Second Workshop on Fact Extraction and VERification (FEVER)

Fever Shared 2.0 Task is a challenge meant for developing automated fact checking systems. Our approach for the Fever 2.0 is based on a previous proposal developed by Team Athene UKP TU Darmstadt. Our proposal modifies the sentence retrieval phase, using statement extraction and representation in the form of triplets (subject, object, action). Triplets are extracted from the claim and compare to triplets extracted from Wikipedia articles using semantic similarity. Our results are satisfactory but there is room for improvement.