Why Swear? Analyzing and Inferring the Intentions of Vulgar Expressions

Eric Holgate; Isabel Cachola; Daniel Preoţiuc-Pietro; Junyi Jessy Li

doi:10.18653/v1/D18-1471

Abstract

Vulgar words are employed in language use for several different functions, ranging from expressing aggression to signaling group identity or the informality of the communication. This versatility of usage of a restricted set of words is challenging for downstream applications and has yet to be studied quantitatively or using natural language processing techniques. We introduce a novel data set of 7,800 tweets from users with known demographic traits where all instances of vulgar words are annotated with one of the six categories of vulgar word use. Using this data set, we present the first analysis of the pragmatic aspects of vulgarity and how they relate to social factors. We build a model able to predict the category of a vulgar word based on the immediate context it appears in with 67.4 macro F1 across six classes. Finally, we demonstrate the utility of modeling the type of vulgar word use in context by using this information to achieve state-of-the-art performance in hate speech detection on a benchmark data set.

Anthology ID: D18-1471
Volume: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
Month: October-November
Year: 2018
Address: Brussels, Belgium
Editors: Ellen Riloff, David Chiang, Julia Hockenmaier, Jun’ichi Tsujii
Venue: EMNLP
SIG: SIGDAT
Publisher: Association for Computational Linguistics
Pages: 4405–4414
URL: https://aclanthology.org/D18-1471/
DOI: 10.18653/v1/D18-1471
Cite (ACL): Eric Holgate, Isabel Cachola, Daniel Preoţiuc-Pietro, and Junyi Jessy Li. 2018. Why Swear? Analyzing and Inferring the Intentions of Vulgar Expressions. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 4405–4414, Brussels, Belgium. Association for Computational Linguistics.
Cite (Informal): Why Swear? Analyzing and Inferring the Intentions of Vulgar Expressions (Holgate et al., EMNLP 2018)
PDF: https://aclanthology.org/D18-1471.pdf
Attachment: D18-1471.Attachment.zip
Video: https://aclanthology.org/D18-1471.mp4
