We leverage a BLSTM with attention to identify toxic spans in texts. We explore different dimensions which affect the model’s performance. The first dimension explored is the toxic set the model is trained on. Besides the provided dataset, we explore the transferability of 5 different toxic related sets, including offensive, toxic, abusive, and hate sets. We find that the solely offensive set shows the highest promise of transferability. The second dimension we explore is methodology, including leveraging attention, employing a greedy remove method, using a frequency ratio, and examining hybrid combinations of multiple methods. We conduct an error analysis to examine which types of toxic spans were missed and which were wrongly inferred as toxic along with the main reasons why they occurred. Finally, we extend our method via ensembles, which achieves our highest F1 score of 55.1.
NLP_UIOWA at SemEval-2020 Task 8: You’re Not the Only One Cursed with Knowledge - Multi Branch Model Memotion Analysis
Ingroj Shrestha | Jonathan Rusert
Proceedings of the Fourteenth Workshop on Semantic Evaluation
We propose hybrid models (HybridE and HybridW) for meme analysis (SemEval 2020 Task 8), which involves sentiment classification (Subtask A), humor classification (Subtask B), and scale of semantic classes (Subtask C). The hybrid model consists of BLSTM and CNN for text and image processing respectively. HybridE provides equal weight to BLSTM and CNN performance, while HybridW provides weightage based on the performance of BLSTM and CNN on a validation set. The performances (macro F1) of our hybrid model on Subtask A are 0.329 (HybridE), 0.328 (HybridW), on Subtask B are 0.507 (HybridE), 0.512 (HybridW), and on Subtask C are 0.309 (HybridE), 0.311 (HybridW).
This paper proposes a system for OffensEval (SemEval 2019 Task 6), which calls for a system to classify offensive language into several categories. Our system is a text based CNN, which learns only from the provided training data. Our system achieves 80 - 90% accuracy for the binary classification problems (offensive vs not offensive and targeted vs untargeted) and 63% accuracy for trinary classification (group vs individual vs other).