WKU_NLP at SemEval-2023 Task 9: Translation Augmented Multilingual Tweet Intimacy Analysis

Qinyuan Zheng


Abstract
This paper describes a system for SemEval-2023 Task 9: Multilingual Tweet Intimacy Analysis. The system uses a pretrained multilingual masked language model as the text encoder and a neural network as the regression model. Data augmentation based on neural machine translation models is applied to improve performance in the low-resource setting. The system is further improved by ensembling the best-performing models for each language. It ranks 4th on languages unseen in the training data and 16th on languages seen in the training data. The code and data are available at https://github.com/Cloudy0219/Multilingual.
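The translation-based augmentation and the per-language ensemble mentioned in the abstract could look roughly like the following minimal sketch. It is not the author's implementation: the function names, the `translate` callable, the reuse of the original score for each translated copy, the use of Pearson correlation for model selection, and the top-`k` averaging are all assumptions made for illustration.

```python
from statistics import mean
from typing import Callable, Dict, List, Tuple

# (text, language code, intimacy score); this layout is an assumption.
Example = Tuple[str, str, float]


def augment_with_translation(
    examples: List[Example],
    translate: Callable[[str, str, str], str],
    target_langs: List[str],
) -> List[Example]:
    """Add one translated copy of each example per target language.

    Assumption: a translated tweet keeps the score of its source tweet.
    `translate(text, src, tgt)` is a hypothetical stand-in for an NMT model.
    """
    augmented = list(examples)
    for text, lang, score in examples:
        for tgt in target_langs:
            if tgt != lang:
                augmented.append((translate(text, lang, tgt), tgt, score))
    return augmented


def pearson(xs: List[float], ys: List[float]) -> float:
    """Pearson correlation; undefined (division by zero) for constant input."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def select_top_models(
    val_preds: Dict[str, List[float]], val_gold: List[float], k: int
) -> List[str]:
    """Rank models on one language's validation set and keep the best k."""
    ranked = sorted(
        val_preds,
        key=lambda name: pearson(val_preds[name], val_gold),
        reverse=True,
    )
    return ranked[:k]


def ensemble_predict(
    test_preds: Dict[str, List[float]], selected: List[str]
) -> List[float]:
    """Average the selected models' predictions example by example."""
    n = len(test_preds[selected[0]])
    return [mean(test_preds[name][i] for name in selected) for i in range(n)]
```

In this sketch, `select_top_models` would be called once per language on that language's validation predictions, and `ensemble_predict` would then average only the selected models for test tweets in that language.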
Anthology ID:
2023.semeval-1.210
Volume:
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Atul Kr. Ojha, A. Seza Doğruöz, Giovanni Da San Martino, Harish Tayyar Madabushi, Ritesh Kumar, Elisa Sartori
Venue:
SemEval
SIG:
SIGLEX
Publisher:
Association for Computational Linguistics
Pages:
1525–1530
URL:
https://aclanthology.org/2023.semeval-1.210
DOI:
10.18653/v1/2023.semeval-1.210
Cite (ACL):
Qinyuan Zheng. 2023. WKU_NLP at SemEval-2023 Task 9: Translation Augmented Multilingual Tweet Intimacy Analysis. In Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023), pages 1525–1530, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
WKU_NLP at SemEval-2023 Task 9: Translation Augmented Multilingual Tweet Intimacy Analysis (Zheng, SemEval 2023)
PDF:
https://aclanthology.org/2023.semeval-1.210.pdf