Classification Approaches to Identify Informative Tweets

Piush Aggarwal


Abstract
Social media platforms have become prime forums for reporting news, with users sharing what they saw, heard, or read. News from social media is potentially useful for various stakeholders, including aid organizations, news agencies, and individuals. However, social media also contains a vast amount of non-news content. For users to benefit from news reported on social media, it is necessary to reliably identify news content and differentiate it from non-news. In this paper, we tackle the challenge of classifying a social media post as news or not. To this end, we provide a new manually annotated dataset containing 2,992 tweets from 5 different topical categories. Unlike earlier datasets, it includes posts by personal users who do not promote a business or a product and are not affiliated with any organization. We also investigate various baseline systems and evaluate their performance on the newly generated dataset. Our results show that the best classifiers are the SVM and BERT models.
Anthology ID:
R19-2002
Volume:
Proceedings of the Student Research Workshop Associated with RANLP 2019
Month:
September
Year:
2019
Address:
Varna, Bulgaria
Editors:
Venelin Kovatchev, Irina Temnikova, Branislava Šandrih, Ivelina Nikolova
Venue:
RANLP
Publisher:
INCOMA Ltd.
Pages:
7–15
URL:
https://aclanthology.org/R19-2002
DOI:
10.26615/issn.2603-2821.2019_002
Cite (ACL):
Piush Aggarwal. 2019. Classification Approaches to Identify Informative Tweets. In Proceedings of the Student Research Workshop Associated with RANLP 2019, pages 7–15, Varna, Bulgaria. INCOMA Ltd.
Cite (Informal):
Classification Approaches to Identify Informative Tweets (Aggarwal, RANLP 2019)
PDF:
https://aclanthology.org/R19-2002.pdf