Tsingriver at SemEval-2023 Task 10: Labeled Data Augmentation in Consistency Training

Yehui Xu, Haiyan Ding


Abstract
Semi-supervised learning has shown promising performance in deep learning. One such approach is consistency training, which uses a large amount of unlabeled data to constrain model predictions to be invariant to input noise. However, the degree of correlation between the unlabeled data and the task objective directly affects model prediction performance. This paper describes our system designed for SemEval-2023 Task 10: Explainable Detection of Online Sexism. We use a consistency training framework with data augmentation as the main strategy to train the model. Our method obtains a score of 0.8180 in subtask A, ranking 57th among all teams.
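The consistency training idea mentioned in the abstract can be illustrated with a minimal sketch. This is a generic consistency-regularized objective, not the authors' implementation: the function names, the use of KL divergence as the consistency term, and the `weight` parameter are all assumptions made for illustration. It combines a supervised cross-entropy term with a penalty on how much the predicted distribution changes when an input is replaced by an augmented (noised) version.

```python
import math


def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two discrete distributions."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def cross_entropy(probs, label, eps=1e-12):
    """Negative log-likelihood of the gold label."""
    return -math.log(probs[label] + eps)


def consistency_training_loss(labeled_logits, labels,
                              orig_logits, aug_logits, weight=1.0):
    """Supervised loss plus a weighted consistency term (hypothetical helper).

    labeled_logits / labels: model outputs and gold labels for labeled examples.
    orig_logits / aug_logits: model outputs for the same inputs before and
    after augmentation; the consistency term pushes them to agree.
    """
    supervised = sum(
        cross_entropy(softmax(l), y) for l, y in zip(labeled_logits, labels)
    ) / len(labels)
    consistency = sum(
        kl_divergence(softmax(o), softmax(a))
        for o, a in zip(orig_logits, aug_logits)
    ) / len(orig_logits)
    return supervised + weight * consistency
```

In this sketch the consistency term is zero when the model predicts the same distribution for an input and its augmented copy, and grows as the two predictions diverge. In practice the logits would come from a neural classifier and the loss would be minimized by gradient descent.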
Anthology ID:
2023.semeval-1.108
Volume:
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Atul Kr. Ojha, A. Seza Doğruöz, Giovanni Da San Martino, Harish Tayyar Madabushi, Ritesh Kumar, Elisa Sartori
Venue:
SemEval
SIG:
SIGLEX
Publisher:
Association for Computational Linguistics
Pages:
782–786
URL:
https://aclanthology.org/2023.semeval-1.108
DOI:
10.18653/v1/2023.semeval-1.108
Cite (ACL):
Yehui Xu and Haiyan Ding. 2023. Tsingriver at SemEval-2023 Task 10: Labeled Data Augmentation in Consistency Training. In Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023), pages 782–786, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
Tsingriver at SemEval-2023 Task 10: Labeled Data Augmentation in Consistency Training (Xu & Ding, SemEval 2023)
PDF:
https://aclanthology.org/2023.semeval-1.108.pdf