Using Attention-based Bidirectional LSTM to Identify Different Categories of Offensive Language Directed Toward Female Celebrities

Sima Sharifirad, Stan Matwin


Abstract
Social media posts reflect the emotions, intentions, and mental state of their users. Twitter users who harass famous female figures may do so with different intentions and intensities. Recent studies have published datasets focusing on different types of online harassment, vulgar language, and emotional intensity. We train, validate, and test our proposed model, an attention-based bidirectional LSTM network, on three datasets: “online harassment”, “vulgar language”, and “valence”, and achieve state-of-the-art performance on two of them. We report the F1 score for each dataset separately, along with the final precision, recall, and macro-averaged F1 score. In addition, we identify ten female figures from different professions and racial backgrounds who have experienced harassment on Twitter. We apply the trained models to ten collected corpora, each related to one famous female figure, to predict the type of harassing language, the type of vulgar language, and the degree of intensity of the language occurring on their social platforms. Interestingly, the results show different patterns of linguistic use targeting different racial backgrounds and occupations. The contribution of this study is two-fold. From the technical perspective, our proposed methodology is shown to outperform the previous state-of-the-art results by a good margin on one of the two available datasets. From the social perspective, we introduce a methodology that can uncover facts about the nature of offensive language targeting women on online social platforms. The collected dataset will be shared publicly for further investigation.
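The abstract reports precision, recall, and a macro-averaged F1 score. As a minimal sketch of how such a score is commonly computed (not the authors' evaluation code), the example below takes per-class F1 from true and predicted labels and averages it with equal weight per class. The function names `per_class_prf` and `macro_f1` and the labels `"label_a"` and `"label_b"` are hypothetical and do not come from the paper or its datasets.

```python
def per_class_prf(y_true, y_pred, label):
    """Return (precision, recall, F1) for one class, treating it as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    # Guard against empty denominators (class never predicted or never present).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(per_class_prf(y_true, y_pred, lab)[2] for lab in labels) / len(labels)


# Hypothetical labels, for illustration only.
gold = ["label_a", "label_a", "label_b", "label_b"]
pred = ["label_a", "label_b", "label_b", "label_b"]
score = macro_f1(gold, pred)
```

Because every class contributes equally to the mean, a macro average does not let a frequent class mask poor performance on a rare one, which is one reason it is often preferred on imbalanced label sets.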
Anthology ID:
W19-3616
Volume:
Proceedings of the 2019 Workshop on Widening NLP
Month:
August
Year:
2019
Address:
Florence, Italy
Editors:
Amittai Axelrod, Diyi Yang, Rossana Cunha, Samira Shaikh, Zeerak Waseem
Venue:
WiNLP
Publisher:
Association for Computational Linguistics
Pages:
46–48
URL:
https://aclanthology.org/W19-3616
Cite (ACL):
Sima Sharifirad and Stan Matwin. 2019. Using Attention-based Bidirectional LSTM to Identify Different Categories of Offensive Language Directed Toward Female Celebrities. In Proceedings of the 2019 Workshop on Widening NLP, pages 46–48, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
Using Attention-based Bidirectional LSTM to Identify Different Categories of Offensive Language Directed Toward Female Celebrities (Sharifirad & Matwin, WiNLP 2019)