<?xml version="1.0" encoding="UTF-8" ?>
<volume id="W17">
  <paper id="3000">
    <title>Proceedings of the First Workshop on Abusive Language Online</title>
    <editor>Zeerak Waseem</editor>
    <editor>Wendy Hui Kyong Chun</editor>
    <editor>Dirk Hovy</editor>
    <editor>Joel Tetreault</editor>
    <month>August</month>
    <year>2017</year>
    <address>Vancouver, BC, Canada</address>
    <publisher>Association for Computational Linguistics</publisher>
    <url>http://www.aclweb.org/anthology/W17-30</url>
    <bibtype>proceedings</bibtype>
    <bibkey>ALW1:2017</bibkey>
  </paper>

  <paper id="3001">
    <title>Dimensions of Abusive Language on Twitter</title>
    <author><first>Isobelle</first><last>Clarke</last></author>
    <author><first>Jack</first><last>Grieve</last></author>
    <booktitle>Proceedings of the First Workshop on Abusive Language Online</booktitle>
    <month>August</month>
    <year>2017</year>
    <address>Vancouver, BC, Canada</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>1&#8211;10</pages>
    <url>http://www.aclweb.org/anthology/W17-3001</url>
    <abstract>In this paper, we use a new categorical form of multidimensional register
	analysis to identify the main dimensions of functional linguistic variation in
	a corpus of abusive language, consisting of racist and sexist Tweets. By
	analysing the use of a wide variety of parts-of-speech and grammatical
	constructions, as well as various features related to Twitter and
	computer-mediated communication, we discover three dimensions of linguistic
	variation in this corpus, which we interpret as being related to the degree of
	interactive, antagonistic and attitudinal language exhibited by individual
	Tweets. We then demonstrate that there is a significant functional difference
	between racist and sexist Tweets, with sexist Tweets tending to be more
	interactive and attitudinal than racist Tweets.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>clarke-grieve:2017:ALW1</bibkey>
  </paper>

  <paper id="3002">
    <title>Constructive Language in News Comments</title>
    <author><first>Varada</first><last>Kolhatkar</last></author>
    <author><first>Maite</first><last>Taboada</last></author>
    <booktitle>Proceedings of the First Workshop on Abusive Language Online</booktitle>
    <month>August</month>
    <year>2017</year>
    <address>Vancouver, BC, Canada</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>11&#8211;17</pages>
    <url>http://www.aclweb.org/anthology/W17-3002</url>
    <abstract>We discuss the characteristics of constructive news comments, and present
	methods to identify them. First, we define the notion of constructiveness.
	Second, we annotate a corpus for constructiveness. Third, we explore whether
	available argumentation corpora can be useful to identify constructiveness in
	news comments. Our model trained on argumentation corpora achieves a top
	accuracy of 72.59% (baseline=49.44%) on our crowd-annotated test data. Finally,
	we examine the relation between constructiveness and toxicity. In our
	crowd-annotated data, 21.42% of the non-constructive comments and 17.89% of the
	constructive comments are toxic, suggesting that non-constructive comments are
	not much more toxic than constructive comments.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>kolhatkar-taboada:2017:ALW1</bibkey>
  </paper>

  <paper id="3003">
    <title>Rephrasing Profanity in Chinese Text</title>
    <author><first>Hui-Po</first><last>Su</last></author>
    <author><first>Zhen-Jie</first><last>Huang</last></author>
    <author><first>Hao-Tsung</first><last>Chang</last></author>
    <author><first>Chuan-Jie</first><last>Lin</last></author>
    <booktitle>Proceedings of the First Workshop on Abusive Language Online</booktitle>
    <month>August</month>
    <year>2017</year>
    <address>Vancouver, BC, Canada</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>18&#8211;24</pages>
    <url>http://www.aclweb.org/anthology/W17-3003</url>
    <abstract>This paper proposes a system that can detect and rephrase profanity in Chinese
	text. Rather than just masking detected profanity, we want to revise the input
	sentence using inoffensive words while keeping its original meaning. 29 such
	rephrasing rules were devised after observing sentences on real-world social
	websites. The overall accuracy of the proposed system is 85.56%.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>su-EtAl:2017:ALW1</bibkey>
  </paper>

  <paper id="3004">
    <title>Deep Learning for User Comment Moderation</title>
    <author><first>John</first><last>Pavlopoulos</last></author>
    <author><first>Prodromos</first><last>Malakasiotis</last></author>
    <author><first>Ion</first><last>Androutsopoulos</last></author>
    <booktitle>Proceedings of the First Workshop on Abusive Language Online</booktitle>
    <month>August</month>
    <year>2017</year>
    <address>Vancouver, BC, Canada</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>25&#8211;35</pages>
    <url>http://www.aclweb.org/anthology/W17-3004</url>
    <abstract>Experimenting with a new dataset of 1.6M user comments from a Greek news portal
	and existing datasets of English Wikipedia comments, we show that an RNN
	outperforms the previous state of the art in moderation. A deep,
	classification-specific attention mechanism further improves the overall
	performance of the RNN. We also compare against a CNN and a word-list baseline,
	considering both fully automatic and semi-automatic moderation.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>pavlopoulos-malakasiotis-androutsopoulos:2017:ALW1</bibkey>
  </paper>

  <paper id="3005">
    <title>Class-based Prediction Errors to Detect Hate Speech with Out-of-vocabulary Words</title>
    <author><first>Joan</first><last>Serr&#224;</last></author>
    <author><first>Ilias</first><last>Leontiadis</last></author>
    <author><first>Dimitris</first><last>Spathis</last></author>
    <author><first>Gianluca</first><last>Stringhini</last></author>
    <author><first>Jeremy</first><last>Blackburn</last></author>
    <author><first>Athena</first><last>Vakali</last></author>
    <booktitle>Proceedings of the First Workshop on Abusive Language Online</booktitle>
    <month>August</month>
    <year>2017</year>
    <address>Vancouver, BC, Canada</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>36&#8211;40</pages>
    <url>http://www.aclweb.org/anthology/W17-3005</url>
    <abstract>Common approaches to text categorization essentially rely either on n-gram
	counts or on word embeddings. This presents important difficulties in highly
	dynamic or quickly-interacting environments, where the appearance of new words
	and/or varied misspellings is the norm. A paradigmatic example of this
	situation is abusive online behavior, with social networks and media platforms
	struggling to effectively combat uncommon or non-blacklisted hate words. To
	better deal with these issues in those fast-paced environments, we propose
	using the error signal of class-based language models as input to text
	classification algorithms. In particular, we train a next-character prediction
	model for any given class and then exploit the error of such class-based models
	to inform a neural network classifier. This way, we shift from the &#8216;ability
	to describe&#8217; seen documents to the &#8216;ability to predict&#8217; unseen content.
	Preliminary studies using out-of-vocabulary splits from abusive tweet data show
	promising results, outperforming competitive text categorization strategies by
	4&#8211;11%.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>serra-EtAl:2017:ALW1</bibkey>
  </paper>

  <paper id="3006">
    <title>One-step and Two-step Classification for Abusive Language Detection on Twitter</title>
    <author><first>Ji Ho</first><last>Park</last></author>
    <author><first>Pascale</first><last>Fung</last></author>
    <booktitle>Proceedings of the First Workshop on Abusive Language Online</booktitle>
    <month>August</month>
    <year>2017</year>
    <address>Vancouver, BC, Canada</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>41&#8211;45</pages>
    <url>http://www.aclweb.org/anthology/W17-3006</url>
    <abstract>Automatic abusive language detection is a difficult but important task for
	online social media. Our research explores a two-step approach of first
	classifying text as abusive and then classifying it into specific types, and
	compares it with a one-step approach of performing a single multi-class
	classification to detect sexist and racist language. On a public English
	Twitter corpus of 20 thousand tweets labeled for sexism and racism, our
	approach shows a promising performance of 0.827 F-measure using HybridCNN in
	one step and 0.824 F-measure using logistic regression in two steps.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>park-fung:2017:ALW1</bibkey>
  </paper>

  <paper id="3007">
    <title>Legal Framework, Dataset and Annotation Schema for Socially Unacceptable Online Discourse Practices in Slovene</title>
    <author><first>Darja</first><last>Fi&#x161;er</last></author>
    <author><first>Toma&#x17E;</first><last>Erjavec</last></author>
    <author><first>Nikola</first><last>Ljube&#x161;i&#x107;</last></author>
    <booktitle>Proceedings of the First Workshop on Abusive Language Online</booktitle>
    <month>August</month>
    <year>2017</year>
    <address>Vancouver, BC, Canada</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>46&#8211;51</pages>
    <url>http://www.aclweb.org/anthology/W17-3007</url>
    <abstract>In this paper we present the legal framework, dataset and annotation schema of
	socially unacceptable discourse practices on social networking platforms in
	Slovenia. On this basis we aim to train an automatic identification and
	classification system with which we wish to contribute towards an improved
	methodology, understanding and treatment of such practices in the contemporary,
	increasingly multicultural information society.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>fivser-erjavec-ljubevsic:2017:ALW1</bibkey>
  </paper>

  <paper id="3008">
    <title>Abusive Language Detection on Arabic Social Media</title>
    <author><first>Hamdy</first><last>Mubarak</last></author>
    <author><first>Kareem</first><last>Darwish</last></author>
    <author><first>Walid</first><last>Magdy</last></author>
    <booktitle>Proceedings of the First Workshop on Abusive Language Online</booktitle>
    <month>August</month>
    <year>2017</year>
    <address>Vancouver, BC, Canada</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>52&#8211;56</pages>
    <url>http://www.aclweb.org/anthology/W17-3008</url>
    <abstract>In this paper, we present our work on detecting abusive language on Arabic
	social media. We extract a list of obscene words and hashtags using common
	patterns used in offensive and rude communications. We also classify Twitter
	users according to whether or not they use any of these words in their tweets.
	We expand the list of obscene words using this classification, and we report
	results on a newly created dataset of Arabic tweets classified as obscene,
	offensive, or clean. We make this dataset freely available for research, in
	addition to the list of obscene words and hashtags. We are also publicly
	releasing a large corpus of classified user comments that were deleted from a
	popular Arabic news site due to violations of the site&#8217;s rules and
	guidelines.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>mubarak-darwish-magdy:2017:ALW1</bibkey>
  </paper>

  <paper id="3009">
    <title>Vectors for Counterspeech on Twitter</title>
    <author><first>Lucas</first><last>Wright</last></author>
    <author><first>Derek</first><last>Ruths</last></author>
    <author><first>Kelly P.</first><last>Dillon</last></author>
    <author><first>Haji Mohammad</first><last>Saleem</last></author>
    <author><first>Susan</first><last>Benesch</last></author>
    <booktitle>Proceedings of the First Workshop on Abusive Language Online</booktitle>
    <month>August</month>
    <year>2017</year>
    <address>Vancouver, BC, Canada</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>57&#8211;62</pages>
    <url>http://www.aclweb.org/anthology/W17-3009</url>
    <abstract>A study of conversations on Twitter found that some arguments between strangers
	led to favorable change in discourse and even in attitudes. The authors propose
	that such exchanges can be usefully distinguished according to whether
	individuals or groups take part on each side, since the opportunity for a
	constructive exchange of views seems to vary accordingly.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>wright-EtAl:2017:ALW1</bibkey>
  </paper>

  <paper id="3010">
    <title>Detecting Nastiness in Social Media</title>
    <author><first>Niloofar</first><last>Safi Samghabadi</last></author>
    <author><first>Suraj</first><last>Maharjan</last></author>
    <author><first>Alan</first><last>Sprague</last></author>
    <author><first>Raquel</first><last>Diaz-Sprague</last></author>
    <author><first>Thamar</first><last>Solorio</last></author>
    <booktitle>Proceedings of the First Workshop on Abusive Language Online</booktitle>
    <month>August</month>
    <year>2017</year>
    <address>Vancouver, BC, Canada</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>63&#8211;72</pages>
    <url>http://www.aclweb.org/anthology/W17-3010</url>
    <abstract>Although social media has made it easy for people to connect on a virtually
	unlimited basis, it has also opened doors to people who misuse it to undermine,
	harass, humiliate, threaten and bully others. There is a lack of adequate
	resources to detect and hinder such behavior. In this paper, we present our
	initial NLP approach to detect invective posts as a first step to eventually
	detect and deter cyberbullying. We crawl data containing profanities and then
	determine whether or not it contains invective. Annotations on this data are
	improved iteratively by in-lab annotations and crowdsourcing. We pursue
	different NLP approaches containing various typical and some newer techniques
	to distinguish the use of swear words in a neutral way from those instances in
	which they are used in an insulting way. We also show that this model not only
	works for our data set, but also can be successfully applied to different data
	sets.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>safisamghabadi-EtAl:2017:ALW1</bibkey>
  </paper>

  <paper id="3011">
    <title>Technology Solutions to Combat Online Harassment</title>
    <author><first>George</first><last>Kennedy</last></author>
    <author><first>Andrew</first><last>McCollough</last></author>
    <author><first>Edward</first><last>Dixon</last></author>
    <author><first>Alexei</first><last>Bastidas</last></author>
    <author><first>John</first><last>Ryan</last></author>
    <author><first>Chris</first><last>Loo</last></author>
    <author><first>Saurav</first><last>Sahay</last></author>
    <booktitle>Proceedings of the First Workshop on Abusive Language Online</booktitle>
    <month>August</month>
    <year>2017</year>
    <address>Vancouver, BC, Canada</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>73&#8211;77</pages>
    <url>http://www.aclweb.org/anthology/W17-3011</url>
    <abstract>This work is part of a new initiative to use machine learning to identify
	online harassment in social media and comment streams. Online harassment goes
	under-reported due to the reliance on humans to identify and report harassment,
	reporting that is further slowed by requirements to fill out forms providing
	context. In addition, the time for moderators to respond and apply human
	judgment can take days, but response times in terms of minutes are needed in
	the online context. Though some of the major social media companies have been
	doing proprietary work in automating the detection of harassment, there are few
	tools available for use by the public. In addition, the amount of labeled
	online harassment data and availability of cross-platform online harassment
	datasets is limited. We present the methodology used to create a harassment
	dataset and classifier and the dataset used to help the system learn what
	harassment looks like.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>kennedy-EtAl:2017:ALW1</bibkey>
  </paper>

  <paper id="3012">
    <title>Understanding Abuse: A Typology of Abusive Language Detection Subtasks</title>
    <author><first>Zeerak</first><last>Waseem</last></author>
    <author><first>Thomas</first><last>Davidson</last></author>
    <author><first>Dana</first><last>Warmsley</last></author>
    <author><first>Ingmar</first><last>Weber</last></author>
    <booktitle>Proceedings of the First Workshop on Abusive Language Online</booktitle>
    <month>August</month>
    <year>2017</year>
    <address>Vancouver, BC, Canada</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>78&#8211;84</pages>
    <url>http://www.aclweb.org/anthology/W17-3012</url>
    <abstract>As the body of research on abusive language detection and analysis grows, there
	is a need for critical consideration of the relationships between different
	subtasks that have been grouped under this label. Based on work on hate speech,
	cyberbullying, and online abuse, we propose a typology that captures central
	similarities and differences between subtasks and discuss the implications of
	this for data annotation and feature construction. We emphasize the practical
	actions that can be taken by researchers to best approach their abusive
	language detection subtask of interest.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>waseem-EtAl:2017:ALW1</bibkey>
  </paper>

  <paper id="3013">
    <title>Using Convolutional Neural Networks to Classify Hate-Speech</title>
    <author><first>Bj&#246;rn</first><last>Gamb&#228;ck</last></author>
    <author><first>Utpal Kumar</first><last>Sikdar</last></author>
    <booktitle>Proceedings of the First Workshop on Abusive Language Online</booktitle>
    <month>August</month>
    <year>2017</year>
    <address>Vancouver, BC, Canada</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>85&#8211;90</pages>
    <url>http://www.aclweb.org/anthology/W17-3013</url>
    <abstract>The paper introduces a deep learning-based Twitter hate-speech text
	classification system. The classifier assigns each tweet to one of four
	predefined categories: racism, sexism, both (racism and sexism) and
	non-hate-speech. Four Convolutional Neural Network models were trained on,
	respectively, character 4-grams, word vectors based on semantic information
	built using word2vec, randomly generated word vectors, and word vectors
	combined with character n-grams. The feature set was down-sized in the networks
	by max-pooling, and a softmax function was used to classify tweets. Tested by
	10-fold cross-validation, the model based on word2vec embeddings performed
	best, with higher precision than recall, and a 78.3% F-score.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>gamback-sikdar:2017:ALW1</bibkey>
  </paper>

  <paper id="3014">
    <title>Illegal is not a Noun: Linguistic Form for Detection of Pejorative Nominalizations</title>
    <author><first>Alexis</first><last>Palmer</last></author>
    <author><first>Melissa</first><last>Robinson</last></author>
    <author><first>Kristy K.</first><last>Phillips</last></author>
    <booktitle>Proceedings of the First Workshop on Abusive Language Online</booktitle>
    <month>August</month>
    <year>2017</year>
    <address>Vancouver, BC, Canada</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>91&#8211;100</pages>
    <url>http://www.aclweb.org/anthology/W17-3014</url>
    <abstract>This paper focuses on a particular type of abusive language, targeting 
	expressions in which typically neutral adjectives take on pejorative meaning
	when used as nouns - compare 'gay people' to 'the gays'. We first collect and
	analyze a corpus of hand-curated, expert-annotated pejorative nominalizations
	for four target adjectives: female, gay, illegal, and poor. We then collect a
	second corpus of automatically-extracted and POS-tagged, crowd-annotated
	tweets. For both corpora, we find support for the hypothesis that some
	adjectives, when nominalized, take on negative meaning. The targeted
	constructions are non-standard yet widely-used, and part-of-speech taggers
	mistag some nominal forms as adjectives. We implement a tool called NomCatcher
	to correct these mistaggings, and find that the same tool is effective for
	identifying new adjectives subject to transformation via nominalization into
	abusive language.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>palmer-robinson-phillips:2017:ALW1</bibkey>
  </paper>

</volume>