Tkol, Httt, and r/radiohead: High Affinity Terms in Reddit Communities

Abhinav Bhandari, Caitrin Armstrong


Abstract
Language is an important marker of a cultural group, large or small. One aspect of language variation between communities is the employment of highly specialized terms with unique significance to the group. We study these high affinity terms across a wide variety of communities by leveraging the rich diversity of Reddit.com. We provide a systematic exploration of high affinity terms, the often rapid semantic shifts they undergo, and their relationship to subreddit characteristics across 2600 diverse subreddits. Our results show that high affinity terms are effective signals of loyal communities, that they undergo more semantic shift than low affinity terms, and that they are a partial barrier to entry for new users. We conclude that Reddit is a robust and valuable data source for testing further theories about high affinity terms across communities.
Anthology ID:
D19-5508
Volume:
Proceedings of the 5th Workshop on Noisy User-generated Text (W-NUT 2019)
Month:
November
Year:
2019
Address:
Hong Kong, China
Editors:
Wei Xu, Alan Ritter, Tim Baldwin, Afshin Rahimi
Venue:
WNUT
Publisher:
Association for Computational Linguistics
Pages:
57–67
URL:
https://aclanthology.org/D19-5508
DOI:
10.18653/v1/D19-5508
Cite (ACL):
Abhinav Bhandari and Caitrin Armstrong. 2019. Tkol, Httt, and r/radiohead: High Affinity Terms in Reddit Communities. In Proceedings of the 5th Workshop on Noisy User-generated Text (W-NUT 2019), pages 57–67, Hong Kong, China. Association for Computational Linguistics.
Cite (Informal):
Tkol, Httt, and r/radiohead: High Affinity Terms in Reddit Communities (Bhandari & Armstrong, WNUT 2019)
PDF:
https://aclanthology.org/D19-5508.pdf