Predicting Twitter User Demographics from Names Alone

Zach Wood-Doughty, Nicholas Andrews, Rebecca Marvin, Mark Dredze


Abstract
Social media analysis frequently requires tools that can automatically infer demographics to contextualize trends. These tools often require hundreds of user-authored messages for each user, which may be prohibitive to obtain when analyzing millions of users. We explore character-level neural models that learn a representation of a user’s name and screen name to predict gender and ethnicity, allowing for demographic inference with minimal data. We release trained models, which may enable new demographic analyses that would otherwise require enormous amounts of data collection.
Anthology ID:
W18-1114
Volume:
Proceedings of the Second Workshop on Computational Modeling of People’s Opinions, Personality, and Emotions in Social Media
Month:
June
Year:
2018
Address:
New Orleans, Louisiana, USA
Editors:
Malvina Nissim, Viviana Patti, Barbara Plank, Claudia Wagner
Venue:
PEOPLES
Publisher:
Association for Computational Linguistics
Pages:
105–111
URL:
https://aclanthology.org/W18-1114
DOI:
10.18653/v1/W18-1114
Cite (ACL):
Zach Wood-Doughty, Nicholas Andrews, Rebecca Marvin, and Mark Dredze. 2018. Predicting Twitter User Demographics from Names Alone. In Proceedings of the Second Workshop on Computational Modeling of People’s Opinions, Personality, and Emotions in Social Media, pages 105–111, New Orleans, Louisiana, USA. Association for Computational Linguistics.
Cite (Informal):
Predicting Twitter User Demographics from Names Alone (Wood-Doughty et al., PEOPLES 2018)
PDF:
https://aclanthology.org/W18-1114.pdf
Code:
mdredze/demographer