UMUTeam and SINAI at SemEval-2023 Task 9: Multilingual Tweet Intimacy Analysis using Multilingual Large Language Models and Data Augmentation

José Antonio García-Díaz, Ronghao Pan, Salud María Jiménez Zafra, María-Teresa Martín-Valdivia, L. Alfonso Ureña-López, Rafael Valencia-García


Abstract
This work presents the participation of the UMUTeam and SINAI research groups in SemEval-2023 Task 9: Multilingual Tweet Intimacy Analysis. The goal of this task is to predict the intimacy of tweets in 10 languages: English, Spanish, Italian, Portuguese, French, Chinese, Hindi, Arabic, Dutch and Korean, of which the last four do not appear in the training data. Our approach combines data augmentation with an ensemble of three multilingual Large Language Models (multilingual BERT, XLM and mDeBERTa). Our team ranked 30th out of 45 participants. Our best results were achieved on two unseen languages: Korean (16th) and Hindi (19th).
Anthology ID:
2023.semeval-1.39
Volume:
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Atul Kr. Ojha, A. Seza Doğruöz, Giovanni Da San Martino, Harish Tayyar Madabushi, Ritesh Kumar, Elisa Sartori
Venue:
SemEval
SIG:
SIGLEX
Publisher:
Association for Computational Linguistics
Pages:
293–299
URL:
https://aclanthology.org/2023.semeval-1.39
DOI:
10.18653/v1/2023.semeval-1.39
Cite (ACL):
José Antonio García-Díaz, Ronghao Pan, Salud María Jiménez Zafra, María-Teresa Martín-Valdivia, L. Alfonso Ureña-López, and Rafael Valencia-García. 2023. UMUTeam and SINAI at SemEval-2023 Task 9: Multilingual Tweet Intimacy Analysis using Multilingual Large Language Models and Data Augmentation. In Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023), pages 293–299, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
UMUTeam and SINAI at SemEval-2023 Task 9: Multilingual Tweet Intimacy Analysis using Multilingual Large Language Models and Data Augmentation (García-Díaz et al., SemEval 2023)
PDF:
https://aclanthology.org/2023.semeval-1.39.pdf