Improving Unsupervised Out-of-domain detection through Pseudo Labeling and Learning
Byounghan Lee | Jaesik Kim | Junekyu Park | Kyung-Ah Sohn
Findings of the Association for Computational Linguistics: EACL 2023
Unsupervised out-of-domain (OOD) detection is a task aimed at discriminating whether given samples are from the in-domain or not, without the categorical labels of in-domain instances. Unlike supervised OOD, as there are no labels for training a classifier, previous works on unsupervised OOD detection adopted the one-class classification (OCC) approach, assuming that the training samples come from a single domain. However, in-domain instances in many real world applications can have a heterogeneous distribution (i.e., across multiple domains or multiple classes). In this case, OCC methods have difficulty in reflecting the categorical information of the domain properly. To tackle this issue, we propose a two-stage framework that leverages the latent categorical information to improve representation learning for textual OOD detection. In the first stage, we train a transformer-based sentence encoder for pseudo labeling by contrastive loss and cluster loss. The second stage is pseudo label learning in which the model is re-trained with pseudo-labels obtained in the first stage. The empirical results on the three datasets show that our two-stage framework significantly outperforms baseline models in more challenging scenarios.
Why Is It Hate Speech? Masked Rationale Prediction for Explainable Hate Speech Detection
Jiyun Kim | Byounghan Lee | Kyung-Ah Sohn
Proceedings of the 29th International Conference on Computational Linguistics
In a hate speech detection model, we should consider two critical aspects in addition to detection performance–bias and explainability. Hate speech cannot be identified based solely on the presence of specific words; the model should be able to reason like humans and be explainable. To improve the performance concerning the two aspects, we propose Masked Rationale Prediction (MRP) as an intermediate task. MRP is a task to predict the masked human rationales–snippets of a sentence that are grounds for human judgment–by referring to surrounding tokens combined with their unmasked rationales. As the model learns its reasoning ability based on rationales by MRP, it performs hate speech detection robustly in terms of bias and explainability. The proposed method generally achieves state-of-the-art performance in various metrics, demonstrating its effectiveness for hate speech detection. Warning: This paper contains samples that may be upsetting.
How Positive Are You: Text Style Transfer using Adaptive Style Embedding
Heejin Kim | Kyung-Ah Sohn
Proceedings of the 28th International Conference on Computational Linguistics
The prevalent approach for unsupervised text style transfer is disentanglement between content and style. However, it is difficult to completely separate style information from the content. Other approaches allow the latent text representation to contain style and the target style to affect the generated output more than the latent representation does. In both approaches, however, it is impossible to adjust the strength of the style in the generated output. Moreover, those previous approaches typically perform both the sentence reconstruction and style control tasks in a single model, which complicates the overall architecture. In this paper, we address these issues by separating the model into a sentence reconstruction module and a style module. We use the Transformer-based autoencoder model for sentence reconstruction and the adaptive style embedding is learned directly in the style module. Because of this separation, each module can better focus on its own task. Moreover, we can vary the style strength of the generated sentence by changing the style of the embedding expression. Therefore, our approach not only controls the strength of the style, but also simplifies the model architecture. Experimental results show that our approach achieves better style transfer performance and content preservation than previous approaches.