Pin_cod_ at SemEval-2020 Task 12: Injecting Lexicons into Bidirectional Long Short-Term Memory Networks to Detect Turkish Offensive Tweets

This paper describes a system (pin_cod_) built for SemEval 2020 Task 12: OffensEval: Multilingual Offensive Language Identification in Social Media (Zampieri et al., 2020). I present a system for detecting Turkish offensive tweets based on a bidirectional long short-term memory network (BiLSTM) whose output is concatenated with lexicon-based features and a social-network specific feature and then passed to two fully connected dense layers. The pin_cod_ system achieved a macro F1-score of 0.7496 for Sub-task A - Offensive language identification in Turkish.


Introduction
With the appearance of influential social media platforms, offensive language is becoming more prevalent and visible. Hiding behind a physically invisible identity, bullies (either an intimate partner or an absolute stranger to the victim) spread abusive, offensive and hateful messages against a particular person or a group of people through many internet platforms. The scope of the victims is broad, and teenagers, women and immigrants are especially frequent targets. Online bullying leads to mental health issues including anxiety disorder and depression, reduces self-confidence, and lowers academic achievement.
Computational approaches to identifying online bullying, hate speech, and offensive and abusive language have gained momentum and attention: a number of workshops on such content directed at generic users, women and/or immigrants in English, Italian, Spanish and German have been organized (Bosco et al., 2018; Fersini et al., 2018; Basile et al., 2019; Zampieri et al., 2019). SemEval 2020 Task 12: OffensEval 2020 is probably the first workshop to address offensive language identification for Turkish, in Sub-task A - Offensive language identification (Çöltekin, 2020). I participated in this sub-task, the main goal of which is to separate offensive posts from non-offensive ones (i.e., OFF for the offensive class, NOT for the not offensive class). Offensive posts comprise insults, threats, and untargeted profanity. Non-offensive posts do not contain any offense or profanity. The pin_cod_ system for this binary classification sub-task was based on a BiLSTM network combined with various lexicon-based features and the presence of user mentions (e.g. @username), followed by two fully connected dense layers. The scripts for preprocessing and the BiLSTM model can be found here 1. The current study shows that the features obtained from several lexicons and a social-network specific feature increased the performance of the BiLSTM model, which reached a macro F1-score of 0.7496. This paper is organized as follows: related work in section 2, system description in section 3, results in section 4, error analysis and discussion in section 5, and conclusions in section 6.

System Description
I followed a supervised learning approach to identify offensive language in Turkish. I implemented a bidirectional long short-term memory network using word embeddings on the Turkish Twitter dataset provided by the organizer of SemEval-2020 Task 12 OffensEval: Sub-task A - Offensive language identification (Çöltekin, 2020). To predict the labels for the given test set of 3528 unlabeled tweets, I used only the provided training set of 31756 labeled tweets, which was split into two parts: 80% for training and 20% for validation.
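The 80/20 split can be illustrated with a minimal sketch. The function name, the fixed seed and the use of a shuffled random split are assumptions made for illustration; the text above does not state how the split was drawn, and the integer ids stand in for the labeled tweets.

```python
import random


def split_train_validation(examples, train_fraction=0.8, seed=42):
    """Shuffle a copy of `examples` and split it into (train, validation)."""
    shuffled = list(examples)
    # A seeded generator keeps the (assumed) random split reproducible.
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Stand-in ids for the 31756 labeled tweets of the provided training set.
examples = list(range(31756))
train, validation = split_train_validation(examples)
print(len(train), len(validation))  # prints 25404 6352
```

With this rounding, the validation part receives the one tweet left over from the non-integer 80% cut.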

Feature Description
The following text-driven features were used in the final model:
• Word embeddings: Turkish fastText embeddings (Joulin et al., 2018) 2 were employed in the BiLSTM model.
• Sentiment features: The NRC Emotion Lexicon, also called EmoLex (Mohammad and Turney, 2013), was used to count the number of negative words per tweet. EmoLex provides word-level emotion (i.e., anger, fear, anticipation, trust, surprise, sadness, joy, disgust) and sentiment (i.e., negative, positive) tags for Turkish emotion and sentiment bearing words. After a detailed investigation of how each tag contributed to the pin_cod_ system, only the negative tag was kept as a feature.
• Hate speech-related features: I used the Turkish version of the HurtLex lexicon (Bassignana et al., 2018), which consists of negative stereotypes, hate words, slurs beyond stereotypes, and other words and insults. After detailed experiments to identify which HurtLex sub-categories increased the performance of the pin_cod_ system, the following sub-categories were used in the final system: DDP: cognitive disabilities and diversity, DMC: moral and behavioral defects, OR: plants, ASM: male genitalia, ASF: female genitalia, OM: words related to homosexuality, CDS: derogatory words. The number of words from each of these selected categories per tweet was used as a feature.
• Offensive/Profane/Slang word lists: I compiled various wordlists 3. I revised each word in these lists and made additions or removals where necessary (e.g. the internet slang terms 'panpa' and 'kanks', which both mean 'mate', 'lads', 'dudes', have positive polarity and are used to refer to a friend, were removed). Then, I checked each tweet for the presence of offensive, profane and slang words. If present, the number of these words per tweet was counted and used as a feature.
• Social-network specific feature: The presence of mention tags per tweet was used as a feature. A tweet that contains a mention tag is represented with 1; otherwise it is represented with 0.
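The count-based features listed above can be sketched as follows. This is a minimal illustration, not the system's actual preprocessing script: the lexicon entries are hypothetical placeholders (the real entries come from EmoLex, HurtLex and the compiled wordlists), the regular-expression tokenizer is an assumption, and so is the ordering of the resulting feature vector.

```python
import re

# Hypothetical placeholder lexicons; the real entries come from EmoLex,
# HurtLex and the compiled word lists, none of which are reproduced here.
NEGATIVE_WORDS = {"kötü", "berbat"}
HURTLEX_CATEGORIES = ["DDP", "DMC", "OR", "ASM", "ASF", "OM", "CDS"]
HURTLEX = {c: {c.lower() + "_word"} for c in HURTLEX_CATEGORIES}
OFFENSIVE_WORDS = {"offensive_word"}

MENTION_PATTERN = re.compile(r"@\w+")
TOKEN_PATTERN = re.compile(r"\w+")


def tokenize(tweet):
    # Simplification: str.lower() does not apply Turkish casing rules
    # (dotted/dotless i), which a real pipeline would need to handle.
    return TOKEN_PATTERN.findall(tweet.lower())


def extract_features(tweet):
    """Return [negative count, 7 HurtLex counts, wordlist count, mention flag]."""
    tokens = tokenize(tweet)
    negative_count = sum(token in NEGATIVE_WORDS for token in tokens)
    hurtlex_counts = [
        sum(token in HURTLEX[category] for token in tokens)
        for category in HURTLEX_CATEGORIES
    ]
    offensive_count = sum(token in OFFENSIVE_WORDS for token in tokens)
    has_mention = 1 if MENTION_PATTERN.search(tweet) else 0
    return [negative_count, *hurtlex_counts, offensive_count, has_mention]


print(extract_features("@kullanici bu film kötü ve berbat ddp_word"))
# prints [2, 1, 0, 0, 0, 0, 0, 0, 0, 1]
```

Under these assumptions each tweet yields a ten-dimensional dense vector: one sentiment count, seven HurtLex category counts, one wordlist count and one binary mention flag.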

Bidirectional Long Short-Term Memory Networks
The core of my offensive language classification model was a bidirectional recurrent neural network, specifically a bidirectional long short-term memory network (Hochreiter and Schmidhuber, 1997), one of the most popular architectures for natural language processing tasks such as text classification. BiLSTM was chosen because it provides further context to the network by training one LSTM on the input sequence and a second one on a reversed copy of the input sequence. FastText embeddings were used as input to the BiLSTM model, which was implemented in Keras 2.3.1 with the TensorFlow 2.1.0 backend. The script was written in Python 3.5.9 on macOS Catalina (version 10.15.3). The embedding vectors had 300 dimensions and the input sequence length was set to 82. After a grid search over the number of neurons, dropout rates, number of epochs, batch sizes, optimization algorithms, activation functions and early-stopping patience, the number of neurons in the BiLSTM layer was set to 200, and both dropout and recurrent dropout were set to 0.2. The output of the recurrent layer was concatenated with the dense text-driven features described in subsection 3.2. The concatenated layer was then fed to two fully connected dense layers, each with 100 neurons and a rectified linear unit (ReLU) activation function. The final layer had 2 units, one per class to be predicted, and a 'sigmoid' activation function, which returned a probability between 0 and 1 for each class. I compiled the model with the 'binary crossentropy' loss, the 'adam' optimizer and the 'accuracy' metric. Callbacks, a set of functions applied at given stages of the training procedure, were also used: validation accuracy was monitored, training was stopped after 5 epochs with no improvement, and the latest best model according to the monitored validation accuracy was saved.
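To make the layer sizes concrete, the following sketch works out the output width and trainable parameter count of each layer after the (frozen or pretrained) embedding lookup. It rests on three assumptions that the description does not confirm: the forward and backward LSTM states are concatenated (giving 2 x 200 outputs), each LSTM gate has a single bias vector, and the dense text-driven features form a 10-dimensional vector (1 negative-word count, 7 HurtLex category counts, 1 wordlist count, 1 mention flag).

```python
def lstm_params(input_dim, units):
    # Four gates, each with input weights, recurrent weights and one bias vector.
    return 4 * (units * (input_dim + units) + units)


def dense_params(input_dim, units):
    # One weight per input-unit pair plus one bias per unit.
    return input_dim * units + units


EMBEDDING_DIM = 300     # fastText vector size
LSTM_UNITS = 200        # neurons per LSTM direction
N_EXTRA_FEATURES = 10   # assumption: 1 + 7 + 1 + 1 dense features
DENSE_UNITS = 100
N_CLASSES = 2

bilstm_out = 2 * LSTM_UNITS                  # assumes concatenated directions
concat_dim = bilstm_out + N_EXTRA_FEATURES   # BiLSTM output + dense features

layers = [
    ("BiLSTM", bilstm_out, 2 * lstm_params(EMBEDDING_DIM, LSTM_UNITS)),
    ("Concatenate", concat_dim, 0),
    ("Dense 1 (ReLU)", DENSE_UNITS, dense_params(concat_dim, DENSE_UNITS)),
    ("Dense 2 (ReLU)", DENSE_UNITS, dense_params(DENSE_UNITS, DENSE_UNITS)),
    ("Output (sigmoid)", N_CLASSES, dense_params(DENSE_UNITS, N_CLASSES)),
]

for name, width, params in layers:
    print(f"{name:<18} width={width:<4} params={params}")

total_params = sum(params for _, _, params in layers)
print("total (excluding embeddings):", total_params)  # prints 853002
```

Under these assumptions the recurrent layer dominates the model size (801600 of the 853002 parameters), while the ten lexicon and mention features add only 1000 weights to the first dense layer.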
Then, the labels (NOT or OFF) were predicted for the test set.
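The early-stopping and best-model rule described above can be sketched independently of any deep learning library. The function below is a hypothetical illustration, not the Keras callback itself; the validation accuracies in the usage example are invented to show the mechanics and are not results from the system.

```python
def early_stopping_trace(val_accuracies, patience=5):
    """Return (best_epoch, best_accuracy, stopped_epoch) for per-epoch accuracies.

    Epochs are numbered from 1. Training stops once `patience` consecutive
    epochs bring no strict improvement over the best validation accuracy.
    """
    best_accuracy = float("-inf")
    best_epoch = None
    epochs_without_improvement = 0
    for epoch, accuracy in enumerate(val_accuracies, start=1):
        if accuracy > best_accuracy:
            # This is the point at which the best model would be saved.
            best_accuracy, best_epoch = accuracy, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return best_epoch, best_accuracy, epoch
    return best_epoch, best_accuracy, len(val_accuracies)


# Invented accuracies, for illustration only.
history = [0.80, 0.84, 0.86, 0.85, 0.86, 0.85, 0.84, 0.83, 0.82]
print(early_stopping_trace(history))  # prints (3, 0.86, 8)
```

In the example the model from epoch 3 is kept: epoch 5 only ties the best accuracy, so five epochs pass without a strict improvement and training stops at epoch 8.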

Results
For OffensEval 2020 Sub-task A - Offensive language identification in Turkish, the architecture was based on a bidirectional long short-term memory network over word embeddings, concatenated with a social-network specific feature, a sentiment feature and lexicon-based features. It was followed by two fully connected layers, so that each Turkish tweet in the test set was predicted as either offensive (OFF) or not offensive (NOT). I did not use any external datasets apart from the training set provided by the organizer. The results of the pin_cod_ participation in Sub-task A of the OffensEval task on the test set are presented in Table 1. The confusion matrix for the test set classification is displayed in Table 2. The official evaluation metric in OffensEval 2020 was the macro F1-score. The pin_cod_ system ranked 23rd among 46 participants, as shown in Table 3.
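For reference, the macro F1-score is the unweighted mean of the per-class F1-scores, so the minority OFF class counts as much as the majority NOT class. A minimal sketch follows; the function names are illustrative and the gold and predicted labels are toy values, not data from the task.

```python
def f1_score(tp, fp, fn):
    """F1 from true positives, false positives and false negatives."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_f1(gold, predicted, labels=("NOT", "OFF")):
    """Unweighted mean of the per-class F1-scores."""
    scores = []
    for label in labels:
        pairs = list(zip(gold, predicted))
        tp = sum(g == label and p == label for g, p in pairs)
        fp = sum(g != label and p == label for g, p in pairs)
        fn = sum(g == label and p != label for g, p in pairs)
        scores.append(f1_score(tp, fp, fn))
    return sum(scores) / len(scores)


# Toy labels, for illustration only.
gold = ["NOT", "NOT", "NOT", "OFF", "OFF", "NOT"]
predicted = ["NOT", "NOT", "OFF", "OFF", "NOT", "NOT"]
print(macro_f1(gold, predicted))  # prints 0.625
```

In the toy example the NOT class reaches an F1 of 0.75 and the OFF class 0.5, giving a macro average of 0.625.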

Error Analysis and Discussion
Table 3: pin_cod_ system results relative to other selected systems in sub-task A of the OffensEval task for Turkish.
The pin_cod_ system for detecting Turkish offensive tweets achieved a satisfactory result. Yet some misclassifications occurred, as presented in Table 2. The system yielded more false negatives than false positives, which could stem from the fact that the majority class of both the training and test sets was not offensive (i.e. NOT). The false negatives, where the true label is offensive but the system predicts not offensive, were mostly related to the following specific phenomena: (i) sarcasm (e.g. 'İmparator büyük takımlar dedi zaten lig 4 uncusunden bahsetmedi ki'; English translation: 'The emperor said big teams, he did not mention the fourth of the league anyway'). Some inconsistencies and incorrect annotations were also noticed in the gold labels. The tweet 'Yeeni yıla bak anasını satıyım6 Günündeeyiz 5 günü tatildiGeeldee seevmee böylee yılıafeerin yeeni yılafeerin', meaning 'Look at the new year what the heck we are on the 6th day it was 5 days of vacation how can I not love such a year well done new year well done', had an offensive label. In fact, it carries a positive meaning although it contains a slang word. Some words, such as 'manyak' meaning 'maniac', occurred in both offensive and non-offensive tweets. Semantically similar contexts containing this word were labeled differently in the gold standard, which might have caused the system to misclassify some tweets. For instance, 'Bana sevgili degilher şeyi beraber yapabilecegim manyak bi insan lazım', meaning 'I do not need a sweetheart I need a maniac person with whom I can do everything together', and 'Manyak işte Allah bilir kafaya neyi takmıştır.', meaning 'Maniac huh God knows what was in his mind.', were both labeled as offensive, while 'İnsan sevdigi için tescilli manyak olabiliyorsa seviyordur bizi sınamayın', meaning 'If a person can become a registered maniac for his sweetheart, he loves her do not try us', was labeled as not offensive.
One conclusion is that an algorithmic model for detecting offensive language cannot be limited to detecting bad words or slang words. Certain insulting terms might sound different to a friend than to an absolute stranger. Other factors, such as the use of emojis along with such insulting terms, might make a message offensive or not offensive. Conversely, highly offensive messages might not include any toxic or hurtful words at all and instead rely on metaphors or metonymies. If the predicting model is restricted to the textual analysis of content, it is likely to yield more false negatives and false positives.

Conclusions
In this paper, I presented the pin_cod_ system that I developed as part of my participation in SemEval-2020 Task 12: OffensEval 2020: Multilingual Offensive Language Identification in Social Media. Specifically, I participated in Sub-task A - Offensive language identification for Turkish. To address this task, I adopted a bidirectional long short-term memory network incorporating lexical features from a polarity lexicon, a hate speech-related lexicon and a compiled offensive/profane/slang wordlist, as well as a social-network specific feature. This concatenated layer was then fed to two fully connected dense layers. Although a satisfactory result was obtained for this task, I plan to improve the system's performance by adopting a knowledge base and more social-network specific features in the future.