Revisiting Supervised Contrastive Learning for Microblog Classification

Junbo Huang, Ricardo Usbeck


Abstract
Microblog content (e.g., Tweets) is noisy due to its informal use of language and the lack of contextual information within each post. To tackle these challenges, state-of-the-art microblog classification models rely on pre-training language models (LMs). However, pre-training dedicated LMs is resource-intensive and not suitable for small labs. Supervised contrastive learning (SCL) has shown its effectiveness with small, readily available resources. In this work, we examine the effectiveness of fine-tuning transformer-based LMs regularized with an SCL loss for English microblog classification. Despite its simplicity, evaluation on two English microblog classification benchmarks (TweetEval and Tweet Topic Classification) shows an improvement over baseline models, with performance gains of up to 11.9 percentage points across all subtasks. All our models are open source.
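The abstract does not spell out the SCL regularizer itself; the sketch below is a minimal PyTorch implementation of the standard supervised contrastive loss of Khosla et al. (2020), the formulation such fine-tuning setups typically add alongside cross-entropy. The function name, temperature default, and normalization choices here are illustrative assumptions, not the authors' exact implementation.

```python
import torch
import torch.nn.functional as F

def supervised_contrastive_loss(features: torch.Tensor,
                                labels: torch.Tensor,
                                temperature: float = 0.07) -> torch.Tensor:
    """Supervised contrastive loss (Khosla et al., 2020) over one batch.

    features: (batch, dim) sentence embeddings, e.g. the [CLS] vector
    labels:   (batch,) integer class labels
    """
    device = features.device
    features = F.normalize(features, dim=1)           # cosine similarity via dot product
    sim = features @ features.T / temperature
    # Numerical stability: subtract the per-row max before exponentiating.
    sim = sim - sim.max(dim=1, keepdim=True).values.detach()

    batch = labels.size(0)
    mask_self = torch.eye(batch, device=device)
    # Positive pairs share a label; exclude each sample's pairing with itself.
    mask_pos = (labels.unsqueeze(0) == labels.unsqueeze(1)).float() - mask_self

    exp_sim = torch.exp(sim) * (1 - mask_self)        # denominator excludes self-pairs
    log_prob = sim - torch.log(exp_sim.sum(dim=1, keepdim=True))

    pos_count = mask_pos.sum(dim=1).clamp(min=1)      # avoid division by zero
    loss = -(mask_pos * log_prob).sum(dim=1) / pos_count
    return loss.mean()
```

In setups like the one described, the training objective is commonly a weighted sum, e.g. `loss = (1 - lam) * cross_entropy + lam * scl` with a tunable `lam`; the exact weighting used in the paper is not stated in this abstract.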
Anthology ID:
2024.emnlp-main.876
Volume:
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2024
Address:
Miami, Florida, USA
Editors:
Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
15644–15653
URL:
https://aclanthology.org/2024.emnlp-main.876
Cite (ACL):
Junbo Huang and Ricardo Usbeck. 2024. Revisiting Supervised Contrastive Learning for Microblog Classification. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 15644–15653, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal):
Revisiting Supervised Contrastive Learning for Microblog Classification (Huang & Usbeck, EMNLP 2024)
PDF:
https://aclanthology.org/2024.emnlp-main.876.pdf