L. Alfonso Urena Lopez

Also published as: Alfonso Ureña-López, L. Alfonso Urena, L. Alfonso Ureña López, L. Alfonso Urena-López, L. Alfonso Ureña- López, L. Alfonso Ureña-López, Luis Alfonso Ureña-López, L. Alfonso Urena-Lopez, L. Alfonso Ureña-lópez

Other people with similar names: L. Alfonso Ureña

2024

pdf bib abs

Question Answering over Tabular Data with DataBench: A Large-Scale Empirical Evaluation of LLMs
Jorge Osés Grijalba | L. Alfonso Ureña-López | Eugenio Martínez Cámara | Jose Camacho-Collados
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Large Language Models (LLMs) are showing emerging abilities, and one of the latest recognized ones deals with their ability to reason and answer questions from tabular data. Although there are some available datasets to assess question answering systems on tabular data, they are not large and diverse enough to properly assess the capabilities of LLMs. To this end, we propose DataBench, a benchmark composed of 65 real-world datasets over several domains, including 20 human-generated questions per dataset, totaling 1300 questions and answers overall. Using this benchmark, we perform a large-scale empirical comparison of several open and closed source models, including both code-generating and in-context learning models. The results highlight the current gap between open-source and closed-source models, with all types of model having room for improvement even in simple boolean questions or involving a single column.

pdf bib abs

SINAI at SemEval-2024 Task 8: Fine-tuning on Words and Perplexity as Features for Detecting Machine Written Text
Alberto Gutiérrez Megías | L. Alfonso Ureña-lópez | Eugenio Martínez Cámara
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)

This work presents the proposed systems of the SINAI team for the subtask A of the Task 8 in SemEval 2024. We present the evaluation of two disparate systems, and our final submitted system. We claim that the perplexity value of a text may be used as classification signal. Accordingly, we conduct a study on the utility of perplexity for discerning text authorship, and we perform a comparative analysis of the results obtained on the datasets of the task. This comparative evaluation includes results derived from the systems evaluated, such as fine-tuning using an XLM-RoBERTa-Large transformer or using perplexity as a classification criterion. In addition, we discuss the results reached on the test set, where we show that there is large differences among the language probability distribution of the training and test sets. These analysis allows us to open new research lines to improve the detection of machine-generated text.

pdf bib abs

SINAI at BioLaySumm: Self-Play Fine-Tuning of Large Language Models for Biomedical Lay Summarisation
Mariia Chizhikova | Manuel Carlos Díaz-Galiano | L. Alfonso Ureña-López | María-Teresa Martín-Valdivia
Proceedings of the 23rd Workshop on Biomedical Natural Language Processing

An effective disclosure of scientific knowledge and advancements to the general public is often hindered by the complexity of the technical language used in research which often results very difficult, if not impossible, for non-experts to understand. In this paper we present the approach developed by the SINAI team as the result of our participation in BioLaySumm shared task hosted by the BioNLP workshop at ACL 2024. Our approach stems from the experimentation we performed in order to test the ability of state-of-the-art pre-trained large language models, namely GPT 3.5, GPT 4 and Llama-3, to tackle this task in a few-shot manner. In order to improve this baseline, we opted for fine-tuning Llama-3 by applying parameter-efficient methodologies. The best performing system which resulted from applying self-play fine tuning method which allows the model to improve while learning to distinguish between its own generations from the previous step from the gold standard summaries. This approach achieved 0.4205 ROUGE-1 score and 0.8583 BERTScore.

pdf bib abs

The Influence of the Perplexity Score in the Detection of Machine-generated Texts
Alberto José Gutiérrez Megías | L. Alfonso Ureña-López | Eugenio Martínez Cámara
Proceedings of the First International Conference on Natural Language Processing and Artificial Intelligence for Cyber Security

The high performance of large language models (LLM) generating natural language represents a real threat, since they can be leveraged to generate any kind of deceptive content. Since there are still disparities among the language generated by machines and the human language, we claim that perplexity may be used as classification signal to discern between machine and human text. We propose a classification model based on XLM-RoBERTa, and we evaluate it on the M4 dataset. The results show that the perplexity score is useful for the identification of machine generated text, but it is constrained by the differences among the LLMs used in the training and test sets.

2023

pdf bib abs

UMUTeam and SINAI at SemEval-2023 Task 9: Multilingual Tweet Intimacy Analysis using Multilingual Large Language Models and Data Augmentation
José Antonio García-Díaz | Ronghao Pan | Salud María Jiménez Zafra | María-Teresa Martn-Valdivia | L. Alfonso Ureña-López | Rafael Valencia-García
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)

This work presents the participation of the UMUTeam and the SINAI research groups in the SemEval-2023 Task 9: Multilingual Tweet Intimacy Analysis. The goal of this task is to predict the intimacy of a set of tweets in 10 languages: English, Spanish, Italian, Portuguese, French, Chinese, Hindi, Arabic, Dutch and Korean, of which, the last 4 are not in the training data. Our approach to address this task is based on data augmentation and the use of three multilingual Large Language Models (multilingual BERT, XLM and mDeBERTA) by ensemble learning. Our team ranked 30th out of 45 participants. Our best results were achieved with two unseen languages: Korean (16th) and Hindi (19th).

pdf bib abs

SINAI at SemEval-2023 Task 10: Leveraging Emotions, Sentiments, and Irony Knowledge for Explainable Detection of Online Sexism
M. Estrella Vallecillo Rodríguez | Flor Miriam Plaza-del-Arco | L. Alfonso Ureña-López | M. Teresa Martín-Valdivia
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)

This paper describes the participation of SINAI research team in the Explainable Detection of Online Sexism (EDOS) Shared Task at SemEval 2023. Specifically, we participate in subtask A (binary sexism detection), subtask B (category of sexism), and subtask C (fine-grained vector of sexism). For the three subtasks, we propose a system that integrates information related to emotions, sentiments, and irony in order to check whether these features help detect sexism content. Our team ranked 46th in subtask A, 37th in subtask B, and 29th in subtask C, achieving 0.8245, 0.6043, and 0.4376 of macro f1-score, respectively, among the participants.

pdf bib abs

SINAI at RadSum23: Radiology Report Summarization Based on Domain-Specific Sequence-To-Sequence Transformer Model
Mariia Chizhikova | Manuel Diaz-Galiano | L. Alfonso Urena-Lopez | M. Teresa Martin-Valdivia
Proceedings of the 22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks

This paper covers participation of the SINAI team in the shared task 1B: Radiology Report Summarization at the BioNLP workshop held on ACL 2023. Our proposal follows a sequence-to-sequence approach which leverages pre-trained multilingual general domain and monolingual biomedical domain pre-trained language models. The best performing system based on domain-specific model reached 33.96 F1RadGraph score which is the fourth best result among the challenge participants. This model was made publicly available on HuggingFace. We also describe an attempt of Proximal Policy Optimization Reinforcement Learning that was made in order to improve the factual correctness measured with F1RadGraph but did not lead to satisfactory results.

2022

pdf bib abs

SINAI@SMM4H’22: Transformers for biomedical social media text mining in Spanish
Mariia Chizhikova | Pilar López-Úbeda | Manuel C. Díaz-Galiano | L. Alfonso Ureña-López | M. Teresa Martín-Valdivia
Proceedings of the Seventh Workshop on Social Media Mining for Health Applications, Workshop & Shared Task

This paper covers participation of the SINAI team in Tasks 5 and 10 of the Social Media Mining for Health (#SSM4H) workshop at COLING-2022. These tasks focus on leveraging Twitter posts written in Spanish for healthcare research. The objective of Task 5 was to classify tweets reporting COVID-19 symptoms, while Task 10 required identifying disease mentions in Twitter posts. The presented systems explore large RoBERTa language models pre-trained on Twitter data in the case of tweet classification task and general-domain data for the disease recognition task. We also present a text pre-processing methodology implemented in both systems and describe an initial weakly-supervised fine-tuning phase alongside with a submission post-processing procedure designed for Task 10. The systems obtained 0.84 F1-score on the Task 5 and 0.77 F1-score on Task 10.

2021

pdf bib abs

Identifying professions & occupations in Health-related Social Media using Natural Language Processing
Alberto Mesa Murgado | Ana Parras Portillo | Pilar López Úbeda | Maite Martin | Alfonso Ureña-López
Proceedings of the Sixth Social Media Mining for Health (#SMM4H) Workshop and Shared Task

This paper describes the entry of the research group SINAI at SMM4H’s ProfNER task on the identification of professions and occupations in social media related with health. Specifically we have participated in Task 7a: Tweet Binary Classification to determine whether a tweet contains mentions of occupations or not, as well as in Task 7b: NER Offset Detection and Classification aimed at predicting occupations mentions and classify them discriminating by professions and working statuses.

pdf bib abs

OffendES: A New Corpus in Spanish for Offensive Language Research
Flor Miriam Plaza-del-Arco | Arturo Montejo-Ráez | L. Alfonso Ureña-López | María-Teresa Martín-Valdivia
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021)

Offensive language detection and analysis has become a major area of research in Natural Language Processing. The freedom of participation in social media has exposed online users to posts designed to denigrate, insult or hurt them according to gender, race, religion, ideology, or other personal characteristics. Focusing on young influencers from the well-known social platforms of Twitter, Instagram, and YouTube, we have collected a corpus composed of 47,128 Spanish comments manually labeled on offensive pre-defined categories. A subset of the corpus attaches a degree of confidence to each label, so both multi-class classification and multi-output regression studies are possible. In this paper, we introduce the corpus, discuss its building process, novelties, and some preliminary experiments with it to serve as a baseline for the research community.

pdf bib abs

SINAI at SemEval-2021 Task 5: Combining Embeddings in a BiLSTM-CRF model for Toxic Spans Detection
Flor Miriam Plaza-del-Arco | Pilar López-Úbeda | L. Alfonso Ureña-López | M. Teresa Martín-Valdivia
Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021)

This paper describes the participation of SINAI team at Task 5: Toxic Spans Detection which consists of identifying spans that make a text toxic. Although several resources and systems have been developed so far in the context of offensive language, both annotation and tasks have mainly focused on classifying whether a text is offensive or not. However, detecting toxic spans is crucial to identify why a text is toxic and can assist human moderators to locate this type of content on social media. In order to accomplish the task, we follow a deep learning-based approach using a Bidirectional variant of a Long Short Term Memory network along with a stacked Conditional Random Field decoding layer (BiLSTM-CRF). Specifically, we test the performance of the combination of different pre-trained word embeddings for recognizing toxic entities in text. The results show that the combination of word embeddings helps in detecting offensive content. Our team ranks 29th out of 91 participants.

2020

pdf bib abs

Corpora Annotated with Negation: An Overview
Salud María Jiménez-Zafra | Roser Morante | María Teresa Martín-Valdivia | L. Alfonso Ureña-López
Computational Linguistics, Volume 46, Issue 1 - March 2020

Negation is a universal linguistic phenomenon with a great qualitative impact on natural language processing applications. The availability of corpora annotated with negation is essential to training negation processing systems. Currently, most corpora have been annotated for English, but the presence of languages other than English on the Internet, such as Chinese or Spanish, is greater every day. In this study, we present a review of the corpora annotated with negation information in several languages with the goal of evaluating what aspects of negation have been annotated and how compatible the corpora are. We conclude that it is very difficult to merge the existing corpora because we found differences in the annotation schemes used, and most importantly, in the annotation guidelines: the way in which each corpus was tokenized and the negation elements that have been annotated. Differently than for other well established tasks like semantic role labeling or parsing, for negation there is no standard annotation scheme nor guidelines, which hampers progress in its treatment.

pdf bib abs

SINAI at SemEval-2020 Task 12: Offensive Language Identification Exploring Transfer Learning Models
Flor Miriam Plaza del Arco | M. Dolores Molina González | Alfonso Ureña-López | Maite Martin
Proceedings of the Fourteenth Workshop on Semantic Evaluation

This paper describes the participation of SINAI team at Task 12: OffensEval 2: Multilingual Offensive Language Identification in Social Media. In particular, the participation in Sub-task A in English which consists of identifying tweets as offensive or not offensive. We preprocess the dataset according to the language characteristics used on social media. Then, we select a small set from the training set provided by the organizers and fine-tune different Transformerbased models in order to test their effectiveness. Our team ranks 20th out of 85 participants in Subtask-A using the XLNet model.

pdf bib abs

Detecting Negation Cues and Scopes in Spanish
Salud María Jiménez-Zafra | Roser Morante | Eduardo Blanco | María Teresa Martín Valdivia | L. Alfonso Ureña López
Proceedings of the Twelfth Language Resources and Evaluation Conference

In this work we address the processing of negation in Spanish. We first present a machine learning system that processes negation in Spanish. Specifically, we focus on two tasks: i) negation cue detection and ii) scope identification. The corpus used in the experimental framework is the SFU Corpus. The results for cue detection outperform state-of-the-art results, whereas for scope detection this is the first system that performs the task for Spanish. Moreover, we provide a qualitative error analysis aimed at understanding the limitations of the system and showing which negation cues and scopes are straightforward to predict automatically, and which ones are challenging.

pdf bib abs

Transfer learning applied to text classification in Spanish radiological reports
Pilar López Úbeda | Manuel Carlos Díaz-Galiano | L. Alfonso Urena Lopez | Maite Martin | Teodoro Martín-Noguerol | Antonio Luna
Proceedings of the LREC 2020 Workshop on Multilingual Biomedical Text Processing (MultilingualBIO 2020)

Pre-trained text encoders have rapidly advanced the state-of-the-art on many Natural Language Processing tasks. This paper presents the use of transfer learning methods applied to the automatic detection of codes in radiological reports in Spanish. Assigning codes to a clinical document is a popular task in NLP and in the biomedical domain. These codes can be of two types: standard classifications (e.g. ICD-10) or specific to each clinic or hospital. In this study we show a system using specific radiology clinic codes. The dataset is composed of 208,167 radiology reports labeled with 89 different codes. The corpus has been evaluated with three methods using the BERT model applied to Spanish: Multilingual BERT, BETO and XLM. The results are interesting obtaining 70% of F1-score with a pre-trained multilingual model.

pdf bib abs

EmoEvent: A Multilingual Emotion Corpus based on different Events
Flor Miriam Plaza del Arco | Carlo Strapparava | L. Alfonso Urena Lopez | Maite Martin
Proceedings of the Twelfth Language Resources and Evaluation Conference

In recent years emotion detection in text has become more popular due to its potential applications in fields such as psychology, marketing, political science, and artificial intelligence, among others. While opinion mining is a well-established task with many standard data sets and well-defined methodologies, emotion mining has received less attention due to its complexity. In particular, the annotated gold standard resources available are not enough. In order to address this shortage, we present a multilingual emotion data set based on different events that took place in April 2019. We collected tweets from the Twitter platform. Then one of seven emotions, six Ekman’s basic emotions plus the “neutral or other emotions”, was labeled on each tweet by 3 Amazon MTurkers. A total of 8,409 in Spanish and 7,303 in English were labeled. In addition, each tweet was also labeled as offensive or no offensive. We report some linguistic statistics about the data set in order to observe the difference between English and Spanish speakers when they express emotions related to the same events. Moreover, in order to validate the effectiveness of the data set, we also propose a machine learning approach for automatically detecting emotions in tweets for both languages, English and Spanish.

2019

pdf bib abs

Using Snomed to recognize and index chemical and drug mentions.
Pilar López Úbeda | Manuel Carlos Díaz Galiano | L. Alfonso Urena Lopez | Maite Martin
Proceedings of the 5th Workshop on BioNLP Open Shared Tasks

In this paper we describe a new named entity extraction system. Our work proposes a system for the identification and annotation of drug names in Spanish biomedical texts based on machine learning and deep learning models. Subsequently, a standardized code using Snomed is assigned to these drugs, for this purpose, Natural Language Processing tools and techniques have been used, and a dictionary of different sources of information has been built. The results are promising, we obtain 78% in F1 score on the first sub-track and in the second task we map with Snomed correctly 72% of the found entities.

pdf bib abs

Detecting Anorexia in Spanish Tweets
Pilar López Úbeda | Flor Miriam Plaza del Arco | Manuel Carlos Díaz Galiano | L. Alfonso Urena Lopez | Maite Martin
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019)

Mental health is one of the main concerns of today’s society. Early detection of symptoms can greatly help people with mental disorders. People are using social networks more and more to express emotions, sentiments and mental states. Thus, the treatment of this information using NLP technologies can be applied to the automatic detection of mental problems such as eating disorders. However, the first step to solving the problem should be to provide a corpus in order to evaluate our systems. In this paper, we specifically focus on detecting anorexia messages on Twitter. Firstly, we have generated a new corpus of tweets extracted from different accounts including anorexia and non-anorexia messages in Spanish. The corpus is called SAD: Spanish Anorexia Detection corpus. In order to validate the effectiveness of the SAD corpus, we also propose several machine learning approaches for automatically detecting anorexia symptoms in the corpus. The good results obtained show that the application of textual classification methods is a promising option for developing this kind of system demonstrating that these tools could be used by professionals to help in the early detection of mental problems.

pdf bib abs

SINAI at SemEval-2019 Task 3: Using affective features for emotion classification in textual conversations
Flor Miriam Plaza-del-Arco | M. Dolores Molina-González | Maite Martin | L. Alfonso Ureña-López
Proceedings of the 13th International Workshop on Semantic Evaluation

Detecting emotions in textual conversation is a challenging problem in absence of nonverbal cues typically associated with emotion, like fa- cial expression or voice modulations. How- ever, more and more users are using message platforms such as WhatsApp or Telegram. For this reason, it is important to develop systems capable of understanding human emotions in textual conversations. In this paper, we carried out different systems to analyze the emotions of textual dialogue from SemEval-2019 Task 3: EmoContext for English language. Our main contribution is the integration of emotional and sentimental features in the classification using the SVM algorithm.

pdf bib abs

SINAI at SemEval-2019 Task 6: Incorporating lexicon knowledge into SVM learning to identify and categorize offensive language in social media
Flor Miriam Plaza-del-Arco | M. Dolores Molina-González | Maite Martin | L. Alfonso Ureña-López
Proceedings of the 13th International Workshop on Semantic Evaluation

Offensive language has an impact across society. The use of social media has aggravated this issue among online users, causing suicides in the worst cases. For this reason, it is important to develop systems capable of identifying and detecting offensive language in text automatically. In this paper, we developed a system to classify offensive tweets as part of our participation in SemEval-2019 Task 6: OffensEval. Our main contribution is the integration of lexical features in the classification using the SVM algorithm.

pdf bib abs

Using Machine Learning and Deep Learning Methods to Find Mentions of Adverse Drug Reactions in Social Media
Pilar López Úbeda | Manuel Carlos Díaz Galiano | Maite Martin | L. Alfonso Urena Lopez
Proceedings of the Fourth Social Media Mining for Health Applications (#SMM4H) Workshop & Shared Task

Over time the use of social networks is becoming very popular platforms for sharing health related information. Social Media Mining for Health Applications (SMM4H) provides tasks such as those described in this document to help manage information in the health domain. This document shows the first participation of the SINAI group. We study approaches based on machine learning and deep learning to extract adverse drug reaction mentions from Twitter. The results obtained in the tasks are encouraging, we are close to the average of all participants and even above in some cases.

pdf bib abs

SINAI at SemEval-2019 Task 5: Ensemble learning to detect hate speech against inmigrants and women in English and Spanish tweets
Flor Miriam Plaza-del-Arco | M. Dolores Molina-González | Maite Martin | L. Alfonso Ureña-López
Proceedings of the 13th International Workshop on Semantic Evaluation

Misogyny and xenophobia are some of the most important social problems. With the in- crease in the use of social media, this feeling ofhatred towards women and immigrants can be more easily expressed, therefore it can cause harmful effects on social media users. For this reason, it is important to develop systems ca- pable of detecting hateful comments automatically. In this paper, we describe our system to analyze the hate speech in English and Spanish tweets against Immigrants and Women as part of our participation in SemEval-2019 Task 5: hatEval. Our main contribution is the integration of three individual algorithms of predic- tion in a model based on Vote ensemble classifier.

2018

pdf bib abs

A review of Spanish corpora annotated with negation
Salud María Jiménez-Zafra | Roser Morante | Maite Martin | L. Alfonso Ureña-López
Proceedings of the 27th International Conference on Computational Linguistics

The availability of corpora annotated with negation information is essential to develop negation processing systems in any language. However, there is a lack of these corpora even for languages like English, and when there are corpora available they are small and the annotations are not always compatible across corpora. In this paper we review the existing corpora annotated with negation in Spanish with the purpose of first, gathering the information to make it available for other researchers and, second, analyzing how compatible are the corpora and how has the linguistic phenomenon been addressed. Our final aim is to develop a supervised negation processing system for Spanish, for which we need training and test data. Our analysis shows that it will not be possible to merge the small corpora existing for Spanish due to lack of compatibility in the annotations.

pdf bib abs

SINAI at SemEval-2018 Task 1: Emotion Recognition in Tweets
Flor Miriam Plaza-del-Arco | Salud María Jiménez-Zafra | Maite Martin | L. Alfonso Ureña-López
Proceedings of the 12th International Workshop on Semantic Evaluation

Emotion classification is a new task that combines several disciplines including Artificial Intelligence and Psychology, although Natural Language Processing is perhaps the most challenging area. In this paper, we describe our participation in SemEval-2018 Task1: Affect in Tweets. In particular, we have participated in EI-oc, EI-reg and E-c subtasks for English and Spanish languages.

pdf bib abs

SINAI at IEST 2018: Neural Encoding of Emotional External Knowledge for Emotion Classification
Flor Miriam Plaza-del-Arco | Eugenio Martínez-Cámara | Maite Martin | L. Alfonso Ureña- López
Proceedings of the 9th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis

In this paper, we describe our participation in WASSA 2018 Implicit Emotion Shared Task (IEST 2018). We claim that the use of emotional external knowledge may enhance the performance and the capacity of generalization of an emotion classification system based on neural networks. Accordingly, we submitted four deep learning systems grounded in a sequence encoding layer. They mainly differ in the feature vector space and the recurrent neural network used in the sequence encoding layer. The official results show that the systems that used emotional external knowledge have a higher capacity of generalization, hence our claim holds.

2017

pdf bib abs

SINAI at SemEval-2017 Task 4: User based classification
Salud María Jiménez-Zafra | Arturo Montejo-Ráez | Maite Martin | L. Alfonso Ureña-López
Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)

This document describes our participation in SemEval-2017 Task 4: Sentiment Analysis in Twitter. We have only reported results for subtask B - English, determining the polarity towards a topic on a two point scale (positive or negative sentiment). Our main contribution is the integration of user information in the classification process. A SVM model is trained with Word2Vec vectors from user’s tweets extracted from his timeline. The obtained results show that user-specific classifiers trained on tweets from user timeline can introduce noise as they are error prone because they are classified by an imperfect system. This encourages us to explore further integration of user information for author-based Sentiment Analysis.

2016

pdf bib abs

Problematic Cases in the Annotation of Negation in Spanish
Salud María Jiménez-Zafra | Maite Martin | L. Alfonso Ureña-López | Toni Martí | Mariona Taulé
Proceedings of the Workshop on Extra-Propositional Aspects of Meaning in Computational Linguistics (ExProM)

This paper presents the main sources of disagreement found during the annotation of the Spanish SFU Review Corpus with negation (SFU ReviewSP -NEG). Negation detection is a challenge in most of the task related to NLP, so the availability of corpora annotated with this phenomenon is essential in order to advance in tasks related to this area. A thorough analysis of the problems found during the annotation could help in the study of this phenomenon.

pdf bib

Domain Adaptation of Polarity Lexicon combining Term Frequency and Bootstrapping
Salud María Jiménez-Zafra | Maite Martin | M. Dolores Molina-Gonzalez | L. Alfonso Ureña-López
Proceedings of the 7th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis