A Computational Exploration of Pejorative Language in Social Media

Liviu P. Dinu, Ioan-Bogdan Iordache, Ana Sabina Uban, Marcos Zampieri


Abstract
In this paper we study pejorative language, an under-explored topic in computational linguistics. Unlike offensive language and hate speech, which existing models already address, pejorative language manifests itself primarily at the lexical level: it describes a word that is used with a negative connotation, which sets it apart from offensive language and other more widely studied categories. Pejorativity is also context-dependent: the same word can be used with or without pejorative connotations, so pejorativity detection is essentially a problem similar to word sense disambiguation. We leverage online dictionaries to build a multilingual lexicon of pejorative terms for English, Spanish, Italian, and Romanian. We additionally release a dataset of tweets annotated for pejorative use. Based on these resources, we analyze the usage and occurrence of pejorative words in social media and describe an attempt to automatically disambiguate pejorative usage in our dataset.
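To make the lexicon-based framing concrete, the following is a minimal sketch of how tweets containing lexicon terms might be flagged as candidates for disambiguation. It is not the authors' implementation: the `TOY_LEXICON` entries and the `find_candidates` helper are hypothetical, chosen only for illustration, and the real lexicon and dataset formats may differ.

```python
import re

# Hypothetical toy lexicon mapping a term to a language code.
# These entries are illustrative and are NOT taken from the released lexicon.
TOY_LEXICON = {"snake": "en", "pig": "en", "rata": "es"}

# Match runs of word characters; Unicode-aware by default for str patterns.
TOKEN_RE = re.compile(r"\w+")


def find_candidates(text, lexicon):
    """Return (term, start, end) for each token of `text` found in `lexicon`.

    A hit only marks a candidate: whether the word is used pejoratively
    still has to be decided from context, as in word sense disambiguation.
    """
    hits = []
    for match in TOKEN_RE.finditer(text):
        token = match.group().lower()
        if token in lexicon:
            hits.append((token, match.start(), match.end()))
    return hits


pejorative_use = find_candidates("He is such a snake.", TOY_LEXICON)
literal_use = find_candidates("I saw a snake at the zoo.", TOY_LEXICON)
```

Both sentences yield a hit for the same term, which illustrates why a lexicon match alone cannot settle pejorative use and a context-sensitive disambiguation step is still needed.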
Anthology ID:
2021.findings-emnlp.296
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2021
Month:
November
Year:
2021
Address:
Punta Cana, Dominican Republic
Venues:
EMNLP | Findings
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Pages:
3493–3498
URL:
https://aclanthology.org/2021.findings-emnlp.296
DOI:
10.18653/v1/2021.findings-emnlp.296
Cite (ACL):
Liviu P. Dinu, Ioan-Bogdan Iordache, Ana Sabina Uban, and Marcos Zampieri. 2021. A Computational Exploration of Pejorative Language in Social Media. In Findings of the Association for Computational Linguistics: EMNLP 2021, pages 3493–3498, Punta Cana, Dominican Republic. Association for Computational Linguistics.
Cite (Informal):
A Computational Exploration of Pejorative Language in Social Media (Dinu et al., Findings 2021)
PDF:
https://aclanthology.org/2021.findings-emnlp.296.pdf
Data
Hate Speech and Offensive Language