Classification without (Proper) Representation: Political Heterogeneity in Social Media and Its Implications for Classification and Behavioral Analysis

Kenan Alkiek, Bohan Zhang, David Jurgens


Abstract
Reddit is home to a broad spectrum of political activity, and users signal their political affiliations in multiple ways—from self-declarations to community participation. Frequently, computational studies have treated political users as a single bloc, both in developing models to infer political leaning and in studying political behavior. Here, we test this assumption of political users and show that commonly-used political-inference models do not generalize, indicating heterogeneous types of political users. The models remain imprecise at best for most users, regardless of which sources of data or methods are used. Across a 14-year longitudinal analysis, we demonstrate that the choice in definition of a political user has significant implications for behavioral analysis. Controlling for multiple factors, political users are more toxic on the platform and inter-party interactions are even more toxic—but not all political users behave this way. Last, we identify a subset of political users who repeatedly flip affiliations, showing that these users are the most controversial of all, acting as provocateurs by more frequently bringing up politics, and are more likely to be banned, suspended, or deleted.
Anthology ID:
2022.findings-acl.43
Volume:
Findings of the Association for Computational Linguistics: ACL 2022
Month:
May
Year:
2022
Address:
Dublin, Ireland
Venues:
ACL | Findings
Publisher:
Association for Computational Linguistics
Pages:
504–522
URL:
https://aclanthology.org/2022.findings-acl.43
DOI:
10.18653/v1/2022.findings-acl.43
Cite (ACL):
Kenan Alkiek, Bohan Zhang, and David Jurgens. 2022. Classification without (Proper) Representation: Political Heterogeneity in Social Media and Its Implications for Classification and Behavioral Analysis. In Findings of the Association for Computational Linguistics: ACL 2022, pages 504–522, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
Classification without (Proper) Representation: Political Heterogeneity in Social Media and Its Implications for Classification and Behavioral Analysis (Alkiek et al., Findings 2022)
PDF:
https://aclanthology.org/2022.findings-acl.43.pdf