iREL at SemEval-2023 Task 9: Improving understanding of multilingual Tweets using Translation-Based Augmentation and Domain Adapted Pre-Trained Models

Bhavyajeet Singh, Ankita Maity, Pavan Kandru, Aditya Hari, Vasudeva Varma


Abstract
This paper describes our system (iREL) for the Tweet intimacy analysis shared task of the SemEval 2023 workshop at ACL 2023. Our system achieved an overall Pearson’s r score of 0.5924 and ranked 10th on the overall leaderboard. For the unseen languages, we ranked third on the leaderboard and achieved a Pearson’s r score of 0.485. We used a single multilingual model for all languages, as discussed in this paper. We provide a detailed description of our pipeline along with multiple ablation experiments to further analyse each component of the pipeline. We demonstrate how translation-based augmentation, domain-specific features, and domain-adapted pre-trained models improve the understanding of intimacy in tweets. The code can be found at https://github.com/bhavyajeet/Multilingual-tweet-intimacy
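The scores reported in the abstract are Pearson's r values between predicted and gold intimacy scores. As a rough illustration of what that metric measures, here is a minimal sketch in plain Python. The function name `pearson_r` and the sample scores are hypothetical and are not taken from the paper or its repository; the task's official scorer may differ in details.

```python
import math


def pearson_r(predicted, gold):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(predicted) != len(gold) or len(predicted) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")

    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_g = sum(gold) / n

    # Sum of products of deviations from the mean (unnormalised covariance).
    cov = sum((p - mean_p) * (g - mean_g) for p, g in zip(predicted, gold))
    # Sums of squared deviations (unnormalised variances).
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_g = sum((g - mean_g) ** 2 for g in gold)

    if var_p == 0 or var_g == 0:
        raise ValueError("correlation is undefined when one list is constant")

    return cov / math.sqrt(var_p * var_g)


if __name__ == "__main__":
    # Made-up intimacy scores, for illustration only.
    gold_scores = [1.0, 2.5, 3.0, 4.5, 2.0]
    model_scores = [1.2, 2.0, 3.5, 4.0, 2.4]
    print(round(pearson_r(model_scores, gold_scores), 4))
```

A value near 1 means the predictions rise and fall with the gold scores; a value near 0 means there is no linear relationship between them.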
Anthology ID:
2023.semeval-1.282
Volume:
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Atul Kr. Ojha, A. Seza Doğruöz, Giovanni Da San Martino, Harish Tayyar Madabushi, Ritesh Kumar, Elisa Sartori
Venue:
SemEval
SIG:
SIGLEX
Publisher:
Association for Computational Linguistics
Pages:
2052–2057
URL:
https://aclanthology.org/2023.semeval-1.282
DOI:
10.18653/v1/2023.semeval-1.282
Cite (ACL):
Bhavyajeet Singh, Ankita Maity, Pavan Kandru, Aditya Hari, and Vasudeva Varma. 2023. iREL at SemEval-2023 Task 9: Improving understanding of multilingual Tweets using Translation-Based Augmentation and Domain Adapted Pre-Trained Models. In Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023), pages 2052–2057, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
iREL at SemEval-2023 Task 9: Improving understanding of multilingual Tweets using Translation-Based Augmentation and Domain Adapted Pre-Trained Models (Singh et al., SemEval 2023)
PDF:
https://aclanthology.org/2023.semeval-1.282.pdf