Nerses Yuzbashyan
2023
An Exploration of Zero-Shot Natural Language Inference-Based Hate Speech Detection
Nerses Yuzbashyan
|
Nikolay Banar
|
Ilia Markov
|
Walter Daelemans
Proceedings of the Third Workshop on Language Technology for Equality, Diversity and Inclusion
Conventional techniques for detecting online hate speech rely on the availability of a sufficient number of annotated instances, which can be costly and time consuming. For this reason, zero-shot or few-shot detection can offer an attractive alternative. In this paper, we explore a zero-shot detection approach based on natural language inference (NLI) models. Since the performance of the models in this approach depends heavily on the choice of a hypothesis, our goal is to determine which factors affect the quality of detection. We conducted a set of experiments with three NLI models and four hate speech datasets. We demonstrate that a zero-shot NLI-based approach is competitive with approaches that require supervised learning, yet they are highly sensitive to the choice of hypothesis. In addition, our experiments indicate that the results for a set of hypotheses on different model-data pairs are positively correlated, and that the correlation is higher for different datasets when using the same model than it is for different models when using the same dataset. These results suggest that if we find a hypothesis that works well for a specific model and domain or for a specific type of hate speech, we can use that hypothesis with the same model also within a different domain. While, another model might require different suitable hypotheses in order to demonstrate high performance.
Search