Logits Reranking via Semantic Labels for Hard Samples in Text Classification

Peijie Huang; Junbao Huang; Yuhong Xu (徐禹洪); Weizhen Li; Xisheng Xiao

doi:10.18653/v1/2024.findings-emnlp.657

Abstract

Pre-trained Language Models (PLMs) have achieved significant success in text classification. However, they still struggle with hard samples, that is, new samples that the model classifies with diminished confidence. Existing research has addressed related issues but often overlooks the semantic information inherent in the labels, treating them merely as one-hot vectors. In this paper, we propose Logits Reranking via Semantic Labels (LRSL), a model-agnostic post-processing method that leverages label semantics and automatic detection of hard samples to improve classification accuracy. LRSL automatically identifies hard samples, which are then jointly processed by MLP-based and similarity-based approaches. Applied only during inference, LRSL operates solely on classification logits, reranking them based on semantic similarities without interfering with the model’s training process. Experiments demonstrate the effectiveness of our method, showing significant improvements across different PLMs. Our code is publicly available at https://github.com/SIGSDSscau/LRSL.
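The abstract describes the pipeline only at a high level, so the following is a minimal sketch of the general idea rather than the authors' implementation. It assumes that a hard sample is one whose top-two softmax probabilities differ by less than a margin, and that reranking blends the classifier's probabilities with a softmax over cosine similarities between a text embedding and label embeddings. The function name `rerank_hard_samples` and the parameters `margin` and `alpha` are hypothetical, and the MLP-based branch mentioned in the abstract is omitted.

```python
import numpy as np


def softmax(x):
    # Numerically stable softmax over the last axis.
    z = x - x.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def cosine_sim(a, b):
    # Pairwise cosine similarity between rows of a and rows of b.
    a = a / np.linalg.norm(a, axis=-1, keepdims=True)
    b = b / np.linalg.norm(b, axis=-1, keepdims=True)
    return a @ b.T


def rerank_hard_samples(logits, text_emb, label_emb, margin=0.2, alpha=0.5):
    """Rerank class scores for low-confidence samples (illustrative only).

    logits:    (n_samples, n_classes) classifier outputs
    text_emb:  (n_samples, dim) text embeddings (assumed to be available)
    label_emb: (n_classes, dim) embeddings of the label names (assumed)
    margin:    assumed hard-sample criterion on the top-two probability gap
    alpha:     assumed weight given to the similarity-based scores
    """
    probs = softmax(logits)
    # Gap between the largest and second-largest class probability.
    top2 = np.sort(probs, axis=-1)[:, -2:]
    hard = (top2[:, 1] - top2[:, 0]) < margin
    # Similarity-based scores, normalized so they can be mixed with probs.
    sim_scores = softmax(cosine_sim(text_emb, label_emb))
    mixed = (1 - alpha) * probs + alpha * sim_scores
    # Only hard samples are rescored; confident predictions are untouched.
    out = probs.copy()
    out[hard] = mixed[hard]
    return out, hard
```

Because the function takes only logits and embeddings as input, it can be applied after any trained classifier at inference time, which mirrors the post-processing, model-agnostic setting the abstract describes. The margin criterion and the linear blend are stand-ins chosen for brevity.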

Anthology ID: 2024.findings-emnlp.657
Volume: Findings of the Association for Computational Linguistics: EMNLP 2024
Month: November
Year: 2024
Address: Miami, Florida, USA
Editors: Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 11250–11262
URL: https://aclanthology.org/2024.findings-emnlp.657/
DOI: 10.18653/v1/2024.findings-emnlp.657
Cite (ACL): Peijie Huang, Junbao Huang, Yuhong Xu, Weizhen Li, and Xisheng Xiao. 2024. Logits Reranking via Semantic Labels for Hard Samples in Text Classification. In Findings of the Association for Computational Linguistics: EMNLP 2024, pages 11250–11262, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal): Logits Reranking via Semantic Labels for Hard Samples in Text Classification (Huang et al., Findings 2024)
PDF: https://aclanthology.org/2024.findings-emnlp.657.pdf
