Jana Kampfmeier


2023

Euphemistic Abuse – A New Dataset and Classification Experiments for Implicitly Abusive Language
Michael Wiegand | Jana Kampfmeier | Elisabeth Eder | Josef Ruppenhofer
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

We address the task of identifying euphemistic abuse (e.g. “You inspire me to fall asleep”) paraphrasing simple explicitly abusive utterances (e.g. “You are boring”). For this task, we introduce a novel dataset that has been created via crowdsourcing. Special attention has been paid to the generation of appropriate negative (non-abusive) data. We report on classification experiments showing that classifiers trained on previous datasets are less capable of detecting such abuse. Best automatic results are obtained by a classifier that augments training data from our new dataset with automatically generated GPT-3 completions. We also present a classifier that combines a few manually extracted features that exemplify the major linguistic phenomena constituting euphemistic abuse.