Jian Wan
2025
A Novel Negative Sample Generation Method for Contrastive Learning in Hierarchical Text Classification
Juncheng Zhou | Lijuan Zhang | Yachen He | Rongli Fan | Lei Zhang | Jian Wan
Proceedings of the 31st International Conference on Computational Linguistics
Hierarchical text classification (HTC) is an important task in natural language processing (NLP). Existing methods typically use both text features and the hierarchical structure of labels to categorize text. However, these approaches often struggle with fine-grained labels, which are highly similar to one another and therefore hard to classify accurately. Contrastive learning is well suited to strengthening fine-grained label features and improving discrimination, but its performance depends strongly on how negative samples are constructed. In this paper, we design a hierarchical sequence ranking (HiSR) method for generating diverse negative samples. These samples maximize the effectiveness of contrastive learning, enhancing the ability of the model to distinguish between fine-grained labels and improving its performance in HTC. Specifically, we transform the entire label set into linear sequences based on the hierarchical structure and rank these sequences according to their quality. During model training, the most suitable negative samples are dynamically selected from the ranked sequences. Contrastive learning then amplifies the differences between similar fine-grained labels by emphasizing the distinction between the ground truth and the generated negative samples, thereby enhancing the discriminative ability of the model. Our method has been tested on three public datasets and achieves state-of-the-art (SOTA) results on two of them, demonstrating its effectiveness.
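The pipeline described in the abstract (linearize the label hierarchy, rank candidate sequences, select negatives, contrast them with the ground truth) can be illustrated with a minimal sketch. This is not the HiSR implementation: the depth-first linearization, the numeric quality score, the top-k selection rule, and the InfoNCE-style loss are all assumptions chosen for illustration, and the names `linearize`, `select_negatives`, and `contrastive_loss` are hypothetical.

```python
import math


def linearize(tree, root):
    """Depth-first traversal that turns a label hierarchy into one linear sequence.

    `tree` maps a label to its list of child labels (assumed representation).
    """
    order, stack = [], [root]
    while stack:
        node = stack.pop()
        order.append(node)
        # Push children reversed so they are visited in their listed order.
        stack.extend(reversed(tree.get(node, [])))
    return order


def select_negatives(candidates, gold, k):
    """Rank (sequence, quality) pairs by quality, drop the gold sequence, keep the top k.

    The quality score is an assumed stand-in for whatever ranking criterion is used.
    """
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
    return [seq for seq, _ in ranked if seq != gold][:k]


def contrastive_loss(pos_score, neg_scores, temperature=1.0):
    """InfoNCE-style loss: -log(exp(s+/t) / (exp(s+/t) + sum_j exp(s_j-/t)))."""
    logits = [pos_score / temperature] + [s / temperature for s in neg_scores]
    m = max(logits)  # subtract the max for numerical stability
    log_denom = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_denom - logits[0]


# Toy hierarchy with hypothetical labels.
tree = {"root": ["A", "B"], "A": ["A1", "A2"]}
sequence = linearize(tree, "root")

candidates = [(("A", "A1"), 0.9), (("A", "A2"), 0.8), (("B",), 0.1)]
negatives = select_negatives(candidates, gold=("A", "A1"), k=1)
```

In this toy setting the highest-ranked non-gold sequence is the sibling path, which is the kind of hard negative that forces a model to separate similar fine-grained labels. The loss shrinks as the score of the ground truth rises above the scores of the selected negatives.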
2022
Attention and Edge-Label Guided Graph Convolutional Networks for Named Entity Recognition
Renjie Zhou | Zhongyi Xie | Jian Wan | Jilin Zhang | Yong Liao | Qiang Liu
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
It has been shown that named entity recognition (NER) can benefit from incorporating the long-distance structured information captured by dependency trees. However, dependency trees built by automatic tools usually contain a certain percentage of errors. Under such circumstances, how to make better use of relevant structured information while ignoring irrelevant or wrong structured information from the dependency trees remains a challenging research problem. In this paper, we propose the Attention and Edge-Label guided Graph Convolutional Network (AELGCN) model and integrate it into BiLSTM-CRF to form the BiLSTM-AELGCN-CRF model. We design an edge-aware node joint update module and introduce a node-aware edge update module to fully explore the hidden structured information and to mitigate, to some extent, the effect of wrong dependency labels. After these two modules, we apply an attention-guided GCN, which automatically learns how to attend selectively to the relevant structured information. We conduct extensive experiments on several standard datasets across four languages and achieve better results than previous approaches. Experimental analysis shows that our proposed model can better exploit the structured information in the dependency tree to improve the recognition of long entities.
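The idea of an attention-guided GCN, where soft attention weights take the place of a hard dependency adjacency matrix, can be illustrated with a minimal sketch. This is not the AELGCN model: it omits the edge-label modules, the learned weight matrices, and the BiLSTM-CRF components, and it assumes scaled dot-product attention as the scoring function. The function names are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention_adjacency(h):
    """Soft adjacency matrix: row i holds node i's attention weights over all nodes.

    Scores are scaled dot products of node features (an assumed scoring function).
    """
    scale = math.sqrt(len(h[0]))
    return [
        softmax([sum(a * b for a, b in zip(hi, hj)) / scale for hj in h])
        for hi in h
    ]


def attention_gcn_layer(h):
    """One propagation step: attention-weighted average of node features, then ReLU.

    A trained layer would also multiply by a learned weight matrix; omitted here.
    """
    adj = attention_adjacency(h)
    n, dim = len(h), len(h[0])
    return [
        [max(0.0, sum(adj[i][j] * h[j][d] for j in range(n))) for d in range(dim)]
        for i in range(n)
    ]


# Three tokens with two-dimensional toy features.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
adjacency = attention_adjacency(tokens)
updated = attention_gcn_layer(tokens)
```

Because every row of the soft adjacency matrix is a probability distribution, each token aggregates information from all other tokens, with larger weights on those it scores as more relevant. A wrong edge in the parsed dependency tree therefore does not have to dominate the update.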