Using Author Embeddings to Improve Tweet Stance Classification

Adrian Benton, Mark Dredze


Abstract
Many social media classification tasks analyze the content of a message but do not consider its context. For example, in tweet stance classification, where a tweet is categorized according to the viewpoint it espouses, the expressed viewpoint depends on latent beliefs held by the user. In this paper we investigate whether incorporating knowledge about the author can improve tweet stance classification. Furthermore, since author information and embeddings are often unavailable for labeled training examples, we propose a semi-supervised pre-training method to predict user embeddings. Although the neural stance classifiers we learn are often outperformed by a baseline SVM, author embedding pre-training yields improvements over a non-pre-trained neural network on four out of five domains in the SemEval 2016 6A tweet stance classification task. On a gun control tweet stance classification dataset, improvements from pre-training are only apparent when training data is limited.
Anthology ID:
W18-6124
Volume:
Proceedings of the 2018 EMNLP Workshop W-NUT: The 4th Workshop on Noisy User-generated Text
Month:
November
Year:
2018
Address:
Brussels, Belgium
Editors:
Wei Xu, Alan Ritter, Tim Baldwin, Afshin Rahimi
Venue:
WNUT
Publisher:
Association for Computational Linguistics
Pages:
184–194
URL:
https://aclanthology.org/W18-6124
DOI:
10.18653/v1/W18-6124
Cite (ACL):
Adrian Benton and Mark Dredze. 2018. Using Author Embeddings to Improve Tweet Stance Classification. In Proceedings of the 2018 EMNLP Workshop W-NUT: The 4th Workshop on Noisy User-generated Text, pages 184–194, Brussels, Belgium. Association for Computational Linguistics.
Cite (Informal):
Using Author Embeddings to Improve Tweet Stance Classification (Benton & Dredze, WNUT 2018)
PDF:
https://aclanthology.org/W18-6124.pdf