NLP_Passau at SemEval-2020 Task 12: Multilingual Neural Network for Offensive Language Detection in English, Danish and Turkish

Omar Hussein, Hachem Sfar, Jelena Mitrović, Michael Granitzer


Abstract
This paper describes a neural network (NN) model that we used to participate in OffensEval, Task 12 of the SemEval 2020 workshop. The aim of this task is to identify offensive speech in social media, particularly in tweets. The model we used, C-BiGRU, combines a Convolutional Neural Network (CNN) with a bidirectional Recurrent Neural Network (RNN). A multidimensional numerical representation (embedding) of each word in the tweets used by the model was determined using fastText. This allowed us to use a dataset of labeled tweets to train the model to detect combinations of words that may convey an offensive meaning. The model was used in sub-task A of the English, Turkish and Danish competitions of the workshop, achieving F1 scores of 90.88%, 76.76% and 76.70%, respectively.
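For readers unfamiliar with the metric, the following is a minimal sketch of how an F1 score for sub-task A could be computed. It assumes macro-averaging over the two OLID labels, OFF and NOT, which is the ranking metric used by OffensEval; the abstract itself only says "F1". The function names and the toy labels are illustrative and are not taken from the paper.

```python
from typing import Sequence


def f1_for_label(gold: Sequence[str], pred: Sequence[str], label: str) -> float:
    """F1 score for one class, treating `label` as the positive class."""
    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g == label and p == label)
    fp = sum(1 for g, p in pairs if g != label and p == label)
    fn = sum(1 for g, p in pairs if g == label and p != label)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(gold: Sequence[str], pred: Sequence[str],
             labels: Sequence[str] = ("OFF", "NOT")) -> float:
    """Unweighted mean of the per-class F1 scores (assumed metric)."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    return sum(f1_for_label(gold, pred, label) for label in labels) / len(labels)


# Toy example with invented labels, not data from the paper.
gold = ["OFF", "NOT", "NOT", "OFF", "NOT", "NOT"]
pred = ["OFF", "NOT", "OFF", "NOT", "NOT", "NOT"]
print(f"{macro_f1(gold, pred):.4f}")  # prints 0.6250
```

Macro-averaging weights both classes equally, so a system cannot score well by predicting only the majority NOT class.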
Anthology ID:
2020.semeval-1.277
Volume:
Proceedings of the Fourteenth Workshop on Semantic Evaluation
Month:
December
Year:
2020
Address:
Barcelona (online)
Venue:
SemEval
SIGs:
SIGSEM | SIGLEX
Publisher:
International Committee for Computational Linguistics
Pages:
2090–2097
URL:
https://aclanthology.org/2020.semeval-1.277
DOI:
10.18653/v1/2020.semeval-1.277
Cite (ACL):
Omar Hussein, Hachem Sfar, Jelena Mitrović, and Michael Granitzer. 2020. NLP_Passau at SemEval-2020 Task 12: Multilingual Neural Network for Offensive Language Detection in English, Danish and Turkish. In Proceedings of the Fourteenth Workshop on Semantic Evaluation, pages 2090–2097, Barcelona (online). International Committee for Computational Linguistics.
Cite (Informal):
NLP_Passau at SemEval-2020 Task 12: Multilingual Neural Network for Offensive Language Detection in English, Danish and Turkish (Hussein et al., SemEval 2020)
PDF:
https://aclanthology.org/2020.semeval-1.277.pdf
Data
OLID