Nikolay Banar
2023
An Exploration of Zero-Shot Natural Language Inference-Based Hate Speech Detection
Nerses Yuzbashyan
|
Nikolay Banar
|
Ilia Markov
|
Walter Daelemans
Proceedings of the Third Workshop on Language Technology for Equality, Diversity and Inclusion
Conventional techniques for detecting online hate speech rely on the availability of a sufficient number of annotated instances, which can be costly and time consuming. For this reason, zero-shot or few-shot detection can offer an attractive alternative. In this paper, we explore a zero-shot detection approach based on natural language inference (NLI) models. Since the performance of the models in this approach depends heavily on the choice of a hypothesis, our goal is to determine which factors affect the quality of detection. We conducted a set of experiments with three NLI models and four hate speech datasets. We demonstrate that a zero-shot NLI-based approach is competitive with approaches that require supervised learning, yet they are highly sensitive to the choice of hypothesis. In addition, our experiments indicate that the results for a set of hypotheses on different model-data pairs are positively correlated, and that the correlation is higher for different datasets when using the same model than it is for different models when using the same dataset. These results suggest that if we find a hypothesis that works well for a specific model and domain or for a specific type of hate speech, we can use that hypothesis with the same model also within a different domain. While, another model might require different suitable hypotheses in order to demonstrate high performance.
2020
Neural Machine Translation of Artwork Titles Using Iconclass Codes
Nikolay Banar
|
Walter Daelemans
|
Mike Kestemont
Proceedings of the 4th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature
We investigate the use of Iconclass in the context of neural machine translation for NL<->EN artwork titles. Iconclass is a widely used iconographic classification system used in the cultural heritage domain to describe and retrieve subjects represented in the visual arts. The resource contains keywords and definitions to encode the presence of objects, people, events and ideas depicted in artworks, such as paintings. We propose a simple concatenation approach that improves the quality of automatically generated title translations for artworks, by leveraging textual information extracted from Iconclass. Our results demonstrate that a neural machine translation system is able to exploit this metadata to boost the translation performance of artwork titles. This technology enables interesting applications of machine learning in resource-scarce domains in the cultural sector.
Search