Emily Sofi Ohman
2024
Text Length and the Function of Intentionality: A Case Study of Contrastive Subreddits
Emily Sofi Ohman | Aatu Liimatta
Proceedings of the 4th International Conference on Natural Language Processing for Digital Humanities
Text length is of central concern in natural language processing (NLP) tasks, yet it is very much under-researched. In this paper, we use social media data, specifically Reddit, to explore the function of text length and intentionality by contrasting subreddits of the same topic where one is considered more serious/professional/academic and the other more relaxed/beginner/layperson. We hypothesize that word choices are more deliberate and intentional in the more in-depth and professional subreddits with texts subsequently becoming longer as a function of this intentionality. We argue that this has deep implications for many applied NLP tasks such as emotion and sentiment analysis, fake news and disinformation detection, and other modeling tasks focused on social media and similar platforms where users interact with each other via the medium of text.