HULAT at SemEval-2023 Task 9: Data Augmentation for Pre-trained Transformers Applied to Multilingual Tweet Intimacy Analysis

Isabel Segura-Bedmar


Abstract
This paper describes our participation in SemEval-2023 Task 9, Intimacy Analysis of Multilingual Tweets. We fine-tune several of the most popular transformer models on the training dataset combined with synthetic data generated by different data augmentation techniques. During the development phase, our best results were obtained with XLM-T, while data augmentation provided only a very slight improvement. Our system ranked 27th out of the 45 participating systems. Despite this modest overall ranking, it shows promising results for languages such as Portuguese, English, and Dutch. All our code is available in the repository https://github.com/isegura/hulat_intimacy.
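As a rough illustration of the approach summarized above, the sketch below fine-tunes an XLM-T checkpoint as a single-output regressor for tweet intimacy scores using the Hugging Face Trainer. The checkpoint name (cardiffnlp/twitter-xlm-roberta-base), the CSV file and column names, and the hyperparameters are assumptions for illustration only; the paper's exact configuration and the data augmentation code are in the linked repository.

```python
# Minimal sketch (not the paper's exact setup): fine-tune an XLM-T checkpoint
# for intimacy-score regression with the Hugging Face Trainer.
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

MODEL_NAME = "cardiffnlp/twitter-xlm-roberta-base"  # XLM-T base checkpoint (assumed)

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
# num_labels=1 turns the classification head into a regression head (MSE loss).
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=1)

# Assumed CSV files with "text" and "label" (float intimacy score) columns.
data = load_dataset("csv", data_files={"train": "train.csv", "validation": "dev.csv"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

data = data.map(tokenize, batched=True)

training_args = TrainingArguments(
    output_dir="xlmt_intimacy",
    learning_rate=2e-5,             # illustrative hyperparameters, not the paper's
    per_device_train_batch_size=16,
    num_train_epochs=3,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=data["train"],
    eval_dataset=data["validation"],
    tokenizer=tokenizer,  # enables dynamic padding via DataCollatorWithPadding
)

trainer.train()
print(trainer.evaluate())  # reports the MSE loss on the validation split
```

The same loop can be fed the augmented training set by simply concatenating the synthetic examples with the original training CSV before loading; the specific augmentation techniques used in the paper are implemented in the repository.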
Anthology ID: 2023.semeval-1.25
Volume: Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)
Month: July
Year: 2023
Address: Toronto, Canada
Editors: Atul Kr. Ojha, A. Seza Doğruöz, Giovanni Da San Martino, Harish Tayyar Madabushi, Ritesh Kumar, Elisa Sartori
Venue: SemEval
SIG: SIGLEX
Publisher: Association for Computational Linguistics
Pages: 177–183
URL: https://aclanthology.org/2023.semeval-1.25
DOI: 10.18653/v1/2023.semeval-1.25
Cite (ACL):
Isabel Segura-Bedmar. 2023. HULAT at SemEval-2023 Task 9: Data Augmentation for Pre-trained Transformers Applied to Multilingual Tweet Intimacy Analysis. In Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023), pages 177–183, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
HULAT at SemEval-2023 Task 9: Data Augmentation for Pre-trained Transformers Applied to Multilingual Tweet Intimacy Analysis (Segura-Bedmar, SemEval 2023)
PDF: https://aclanthology.org/2023.semeval-1.25.pdf