A Novel Negative Sample Generation Method for Contrastive Learning in Hierarchical Text Classification

Juncheng Zhou, Lijuan Zhang, Yachen He, Rongli Fan, Lei Zhang, Jian Wan


Abstract
Hierarchical text classification (HTC) is an important task in natural language processing (NLP). Existing methods typically use both text features and the hierarchical structure of labels to categorize text. However, these approaches often struggle with fine-grained labels, which are highly similar to one another and therefore hard to classify accurately. Contrastive learning is well suited to strengthening fine-grained label features and improving discrimination, but its performance depends strongly on how negative samples are constructed. In this paper, we design a hierarchical sequence ranking (HiSR) method for generating diverse negative samples. These samples make contrastive learning more effective, which improves the model's ability to distinguish between fine-grained labels and its overall performance in HTC. Specifically, we transform the entire label set into linear sequences based on the hierarchical structure and rank these sequences according to their quality. During model training, the most suitable negative samples are dynamically selected from the ranked sequences. Contrastive learning then amplifies the differences between similar fine-grained labels by emphasizing the distinction between the ground truth and the generated negative samples, thereby enhancing the discriminative ability of the model. We evaluate our method on three public datasets and achieve state-of-the-art (SOTA) results on two of them, demonstrating its effectiveness.
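The pipeline outlined in the abstract (linearize labels, rank candidate sequences, pick negatives, contrast them with the ground truth) can be illustrated with a rough sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the toy `HIERARCHY`, the sibling-swap way of producing candidates, the shared-prefix length used as a stand-in for sequence "quality", the fixed `k` in place of dynamic selection, and the InfoNCE-style loss over plain scalar scores.

```python
import math

# Hypothetical toy label hierarchy (parent -> children); not from the paper.
HIERARCHY = {
    "root": ["science", "sports"],
    "science": ["physics", "biology"],
    "sports": ["soccer", "tennis"],
}

def linearize(label_set, root="root"):
    """Order a label set as a sequence by depth-first (pre-order) traversal."""
    ordered, stack = [], [root]
    while stack:
        node = stack.pop()
        if node in label_set:
            ordered.append(node)
        # Push children reversed so they are visited in their listed order.
        stack.extend(reversed(HIERARCHY.get(node, [])))
    return ordered

def parent_of(label):
    """Return the parent of a label, or None if it has no parent."""
    for parent, children in HIERARCHY.items():
        if label in children:
            return parent
    return None

def candidate_negatives(gold_seq):
    """Assumed generator: swap one label at a time for one of its siblings."""
    negatives = []
    for i, label in enumerate(gold_seq):
        for sibling in HIERARCHY.get(parent_of(label), []):
            if sibling != label:
                negatives.append(gold_seq[:i] + [sibling] + gold_seq[i + 1:])
    return negatives

def shared_prefix(a, b):
    """Number of leading positions on which two sequences agree."""
    count = 0
    for x, y in zip(a, b):
        if x != y:
            break
        count += 1
    return count

def rank_negatives(gold_seq, k=None):
    """Rank candidates so those diverging deepest in the hierarchy come first."""
    ranked = sorted(candidate_negatives(gold_seq),
                    key=lambda seq: shared_prefix(seq, gold_seq),
                    reverse=True)
    return ranked if k is None else ranked[:k]

def info_nce(pos_score, neg_scores, temperature=1.0):
    """InfoNCE-style loss: -log softmax of the positive score among all scores."""
    logits = [pos_score / temperature] + [s / temperature for s in neg_scores]
    peak = max(logits)  # subtract the max for numerical stability
    log_sum = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_sum - logits[0]

gold = linearize({"physics", "science"})
hard_negatives = rank_negatives(gold, k=1)
print(gold, hard_negatives)
```

In this toy setup the negative that differs from the ground truth only at the leaf level is ranked first, which matches the intuition that fine-grained confusions are the ones worth contrasting against. In a real model the scalar scores passed to `info_nce` would come from the classifier's scoring of each label sequence for a given text.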
Anthology ID:
2025.coling-main.378
Volume:
Proceedings of the 31st International Conference on Computational Linguistics
Month:
January
Year:
2025
Address:
Abu Dhabi, UAE
Editors:
Owen Rambow, Leo Wanner, Marianna Apidianaki, Hend Al-Khalifa, Barbara Di Eugenio, Steven Schockaert
Venue:
COLING
Publisher:
Association for Computational Linguistics
Pages:
5645–5655
URL:
https://aclanthology.org/2025.coling-main.378/
Cite (ACL):
Juncheng Zhou, Lijuan Zhang, Yachen He, Rongli Fan, Lei Zhang, and Jian Wan. 2025. A Novel Negative Sample Generation Method for Contrastive Learning in Hierarchical Text Classification. In Proceedings of the 31st International Conference on Computational Linguistics, pages 5645–5655, Abu Dhabi, UAE. Association for Computational Linguistics.
Cite (Informal):
A Novel Negative Sample Generation Method for Contrastive Learning in Hierarchical Text Classification (Zhou et al., COLING 2025)
PDF:
https://aclanthology.org/2025.coling-main.378.pdf