Tianming Jiang


2025

Learn from Failure: Causality-guided Contrastive Learning for Generalizable Implicit Hate Speech Detection
Tianming Jiang
Proceedings of the 31st International Conference on Computational Linguistics

Implicit hate speech presents a significant challenge for automatic detection systems due to its subtlety and ambiguity. Traditional models trained using empirical risk minimization (ERM) often rely on correlations between class labels and spurious attributes, which leads to poor performance on data lacking these correlations. In this paper, we propose a novel approach using causality-guided contrastive learning (CCL) to enhance the generalizability of implicit hate speech detection. Since ERM tends to identify spurious attributes, CCL works by aligning the representations of samples with the same class but opposite spurious attributes, identified through ERM’s inference failure. This method reduces the model’s reliance on spurious correlations, allowing it to learn more robust features and handle diverse, nuanced contexts better. Our extensive experiments on multiple implicit hate speech datasets show that our approach outperforms current state-of-the-art methods in cross-domain generalization.
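The alignment step described in the abstract can be illustrated with a small contrastive loss. The sketch below is not the paper's implementation. It assumes that whether an ERM model classified a sample correctly serves as a proxy for its spurious attribute, that positives are same-class samples with the opposite ERM outcome, and that negatives are samples from the other class. The function names, the cosine similarity, the InfoNCE-style form, and the temperature value are all illustrative choices, and embeddings are assumed to be non-zero vectors.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def ccl_style_loss(embeddings, labels, erm_correct, temperature=0.1):
    """Hypothetical InfoNCE-style loss in the spirit of the abstract.

    embeddings  : list of vectors, one per sample
    labels      : class label per sample
    erm_correct : True if an ERM model classified the sample correctly
                  (assumed proxy for the spurious attribute)
    """
    n = len(embeddings)
    losses = []
    for i in range(n):
        # Positives: same class, opposite ERM outcome (assumed to mean
        # opposite spurious attribute).
        positives = [
            j for j in range(n)
            if j != i
            and labels[j] == labels[i]
            and erm_correct[j] != erm_correct[i]
        ]
        # Negatives: every sample from a different class.
        negatives = [j for j in range(n) if labels[j] != labels[i]]
        if not positives or not negatives:
            continue  # anchor has nothing to contrast against
        scores = {
            j: math.exp(cosine(embeddings[i], embeddings[j]) / temperature)
            for j in positives + negatives
        }
        denom = sum(scores.values())
        # Average negative log-probability of picking a positive.
        anchor_loss = -sum(math.log(scores[j] / denom) for j in positives)
        losses.append(anchor_loss / len(positives))
    return sum(losses) / len(losses) if losses else 0.0


# Toy data: two samples per class, one ERM success and one ERM failure each.
labels = [1, 1, 0, 0]
erm_correct = [True, False, True, False]

# Same-class samples lie close together.
aligned = [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0], [0.1, 1.0]]
# Samples cluster by a class-independent pattern instead.
misaligned = [[1.0, 0.0], [0.0, 1.0], [0.1, 1.0], [1.0, 0.1]]

aligned_loss = ccl_style_loss(aligned, labels, erm_correct)
misaligned_loss = ccl_style_loss(misaligned, labels, erm_correct)
```

In this toy setup the loss is lower when same-class samples with opposite ERM outcomes sit close together, which is the behaviour the abstract attributes to CCL. A real system would compute the loss on encoder outputs and backpropagate through it.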