Alicia Breidenstein
2024
Using Locally Learnt Word Representations for better Textual Anomaly Detection
Alicia Breidenstein, Matthieu Labeau
Proceedings of the Fifth Workshop on Insights from Negative Results in NLP
The literature on general-purpose textual anomaly detection is quite sparse, as most textual anomaly detection methods are implemented as out-of-domain detection in the context of pre-established classification tasks. Notably, in a field where pre-trained representations and models are commonly used, the impact of the pre-training data on a task that lacks supervision has not been studied. In this paper, we use the simple setting of k-classes-out anomaly detection and search for the best pairing of representation and classifier. We show that well-chosen embeddings allow a simple anomaly detection baseline such as OC-SVM to match, and even outperform, deep state-of-the-art models.
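As a rough illustration of the kind of baseline the abstract refers to, the sketch below fits a one-class SVM on fixed-size vectors and scores unseen points. It is not the authors' pipeline: the synthetic Gaussian vectors stand in for document embeddings, and the dimension, kernel, `gamma`, and `nu` values are assumptions chosen only to make the example run. It uses scikit-learn's `OneClassSVM` and `roc_auc_score`.

```python
import numpy as np
from sklearn.metrics import roc_auc_score
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)

# Assumption: synthetic vectors stand in for document embeddings
# (e.g. averaged word vectors). Real experiments would embed text here.
dim = 50
inlier_train = rng.normal(loc=0.0, scale=1.0, size=(500, dim))
inlier_test = rng.normal(loc=0.0, scale=1.0, size=(200, dim))
outlier_test = rng.normal(loc=1.5, scale=1.0, size=(200, dim))

# The detector is trained on inlier data only; no anomaly labels are used.
# Hyperparameters are illustrative assumptions, not values from the paper.
detector = OneClassSVM(kernel="rbf", gamma="scale", nu=0.1)
detector.fit(inlier_train)

X_test = np.vstack([inlier_test, outlier_test])
y_test = np.concatenate([np.zeros(len(inlier_test)), np.ones(len(outlier_test))])

# decision_function is larger for points that look like the training data,
# so its negation serves as an anomaly score (higher means more anomalous).
anomaly_score = -detector.decision_function(X_test)

# Labels are used only for evaluation, with 1 marking an anomaly.
auc = roc_auc_score(y_test, anomaly_score)
print(f"AUROC on synthetic data: {auc:.3f}")
```

Swapping the synthetic arrays for embeddings of real documents leaves the rest of the sketch unchanged, which is what makes it easy to compare representations under a fixed detector.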