2023
HULAT at SemEval-2023 Task 9: Data Augmentation for Pre-trained Transformers Applied to Multilingual Tweet Intimacy Analysis
Isabel Segura-Bedmar
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)
This paper describes our participation in SemEval-2023 Task 9, Intimacy Analysis of Multilingual Tweets. We fine-tune some of the most popular transformer models with the training dataset and synthetic data generated by different data augmentation techniques. During the development phase, our best results were obtained by using XLM-T. Data augmentation techniques provide a very slight improvement in the results. Our system ranked in the 27th position out of the 45 participating systems. Despite its modest results, our system shows promising results in languages such as Portuguese, English, and Dutch. All our code is available in the repository https://github.com/isegura/hulat_intimacy.
HULAT at SemEval-2023 Task 10: Data Augmentation for Pre-trained Transformers Applied to the Detection of Sexism in Social Media
Isabel Segura-Bedmar
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)
This paper describes our participation in SemEval-2023 Task 10, whose goal is the detection of sexism in social media. We explore some of the most popular transformer models such as BERT, DistilBERT, RoBERTa, and XLNet. We also study different data augmentation techniques to increase the training dataset. During the development phase, our best results were obtained by using RoBERTa and data augmentation for tasks B and C. However, the use of synthetic data does not improve the results for task C. We participated in the three subtasks. Our approach still has much room for improvement, especially in the two fine-grained classifications. All our code is available in the repository https://github.com/isegura/hulat_edos.
2018
UC3M-NII Team at SemEval-2018 Task 7: Semantic Relation Classification in Scientific Papers via Convolutional Neural Network
Víctor Suárez-Paniagua | Isabel Segura-Bedmar | Akiko Aizawa
Proceedings of the 12th International Workshop on Semantic Evaluation
This paper reports our participation in SemEval-2018 Task 7 on extraction and classification of relationships between entities in scientific papers. Our approach is based on the use of a Convolutional Neural Network (CNN) trained on 350 abstracts with manually annotated entities and relations. Our hypothesis is that this deep learning model can be applied to extract and classify relations between entities in scientific papers at the same time. We use the Part-of-Speech tags and the distances to the target entities as part of the embedding for each word, and we blind all the entities by replacing them with marker names. In addition, we use sampling techniques to overcome the imbalance issues of this dataset. Our architecture obtained an F1-score of 35.4% for the relation extraction task and 18.5% for the relation classification task with a basic configuration of the one-step CNN.
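The entity blinding and distance features mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names, the marker strings `ENTITY1` and `ENTITY2`, and the assumption that the first entity ends before the second begins are all hypothetical choices made for illustration.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) token indices, end exclusive


def blind_entities(tokens: List[str], e1: Span, e2: Span,
                   m1: str = "ENTITY1", m2: str = "ENTITY2") -> List[str]:
    """Replace each entity span with a single marker token.

    Assumes (for this sketch only) that e1 ends before e2 starts.
    """
    s1, t1 = e1
    s2, t2 = e2
    return tokens[:s1] + [m1] + tokens[t1:s2] + [m2] + tokens[t2:]


def relative_distances(blinded: List[str],
                       m1: str = "ENTITY1",
                       m2: str = "ENTITY2") -> List[Tuple[int, int]]:
    """For every token, return its signed distance to each entity marker."""
    i1 = blinded.index(m1)
    i2 = blinded.index(m2)
    return [(i - i1, i - i2) for i in range(len(blinded))]


if __name__ == "__main__":
    tokens = "We use a neural network for relation extraction".split()
    blinded = blind_entities(tokens, (2, 5), (6, 8))
    print(blinded)
    print(relative_distances(blinded))
```

In a CNN of this kind, each pair of distances would typically be mapped to learned position embeddings and concatenated with the word and Part-of-Speech embeddings; that step is omitted here.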
2017
LABDA at SemEval-2017 Task 10: Extracting Keyphrases from Scientific Publications by combining the BANNER tool and the UMLS Semantic Network
Isabel Segura-Bedmar | Cristóbal Colón-Ruiz | Paloma Martínez
Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)
This paper describes the system presented by the LABDA group at SemEval 2017 Task 10 ScienceIE, specifically for the subtasks of identification and classification of keyphrases from scientific articles. For the task of identification, we use the BANNER tool, a named entity recognition system, which is based on conditional random fields (CRF) and has obtained successful results in the biomedical domain. To classify keyphrases, we study the UMLS semantic network and propose a possible linking between the keyphrase types and the UMLS semantic groups. Based on this semantic linking, we create a dictionary for each keyphrase type. Then, a feature indicating whether a token is found in one of these dictionaries is incorporated into the feature set used by the BANNER tool. The final results on the test dataset show that our system still needs to be improved, but the conditional random fields and, consequently, the BANNER system can be used as a first approximation to identify and classify keyphrases.
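The dictionary feature described in this abstract can be sketched as a per-token membership test. This is an illustration only, not BANNER's API or the authors' implementation: the function name, the feature naming scheme, and the tiny example dictionaries are hypothetical.

```python
from typing import Dict, List, Set


def dictionary_features(tokens: List[str],
                        dictionaries: Dict[str, Set[str]]) -> List[Dict[str, bool]]:
    """Return one feature dict per token.

    `dictionaries` maps a keyphrase type name to a set of lowercase terms.
    Each feature says whether the lowercased token appears in that dictionary.
    """
    features = []
    for token in tokens:
        key = token.lower()
        features.append({
            f"in_{name}_dict": key in terms
            for name, terms in dictionaries.items()
        })
    return features


if __name__ == "__main__":
    # Hypothetical dictionaries; the real ones were built from UMLS semantic groups.
    dicts = {"material": {"silicon"}, "process": {"annealing"}}
    for token, feats in zip(["Silicon", "annealing", "works"],
                            dictionary_features(["Silicon", "annealing", "works"], dicts)):
        print(token, feats)
```

Feature dicts of this shape could be merged with the other token features that a CRF tagger consumes.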
LABDA at SemEval-2017 Task 10: Relation Classification between keyphrases via Convolutional Neural Network
Víctor Suárez-Paniagua | Isabel Segura-Bedmar | Paloma Martínez
Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)
In this paper, we describe our participation in the subtask of extraction of relationships between two identified keyphrases. This task can be very helpful in improving search engines for scientific articles. Our approach is based on the use of a convolutional neural network (CNN) trained on the training dataset. This deep learning model has already achieved successful results for the extraction of relationships between named entities. Thus, our hypothesis is that this model can also be applied to extract relations between keyphrases. The official results of the task show that our architecture obtained an F1-score of 0.38% for Keyphrases Relation Classification. This performance is lower than expected due to the generic preprocessing phase and the basic configuration of the CNN model; more complex architectures are proposed as future work to increase the classification rate.
Exploring Convolutional Neural Networks for Sentiment Analysis of Spanish tweets
Isabel Segura-Bedmar | Antonio Quirós | Paloma Martínez
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers
Spanish is the third most-used language on the internet, after English and Chinese, with a total of 7.7% (more than 277 million users) and a huge internet growth of more than 1,400%. However, most work on sentiment analysis has focused on English. This paper describes a deep learning system for Spanish sentiment analysis. To the best of our knowledge, this is the first work that explores the use of a convolutional neural network for polarity classification of Spanish tweets.
2016
LABDA at the 2016 BioASQ challenge task 4a: Semantic Indexing by using ElasticSearch
Isabel Segura-Bedmar | Adrián Carruana | Paloma Martínez
Proceedings of the Fourth BioASQ workshop
2015
Exploring Word Embedding for Drug Name Recognition
Isabel Segura-Bedmar | Víctor Suárez-Paniagua | Paloma Martínez
Proceedings of the Sixth International Workshop on Health Text Mining and Information Analysis
2014
Detecting drugs and adverse events from Spanish social media streams
Isabel Segura-Bedmar | Ricardo Revert | Paloma Martínez
Proceedings of the 5th International Workshop on Health Text Mining and Information Analysis (Louhi)
Extracting drug indications and adverse drug reactions from Spanish health social media
Isabel Segura-Bedmar | Santiago de la Peña González | Paloma Martínez
Proceedings of BioNLP 2014
2013
SemEval-2013 Task 9: Extraction of Drug-Drug Interactions from Biomedical Texts (DDIExtraction 2013)
Isabel Segura-Bedmar | Paloma Martínez | María Herrero-Zazo
Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013)
2008
A preliminary approach to extract drugs by combining UMLS resources and USAN naming conventions
Isabel Segura-Bedmar | Paloma Martínez | Doaa Samy
Proceedings of the Workshop on Current Trends in Biomedical Natural Language Processing
2007
UC3M: Classification of Semantic Relations between Nominals using Sequential Minimal Optimization
Isabel Segura Bedmar | Doaa Samy | Jose L. Martinez
Proceedings of the Fourth International Workshop on Semantic Evaluations (SemEval-2007)