Dong Zhou


2023

An Effective Deployment of Contrastive Learning in Multi-label Text Classification
Nankai Lin | Guanqiu Qin | Gang Wang | Dong Zhou | Aimin Yang
Findings of the Association for Computational Linguistics: ACL 2023

The effectiveness of contrastive learning in natural language processing tasks has yet to be fully explored and analyzed. How to construct positive and negative samples correctly and reasonably is the core challenge of contrastive learning, and it is even harder to identify contrastive objects in multi-label text classification, for which very few contrastive losses have been proposed. In this paper, we investigate the problem from a different angle by proposing five novel contrastive losses for multi-label text classification: Strict Contrastive Loss (SCL), Intra-label Contrastive Loss (ICL), Jaccard Similarity Contrastive Loss (JSCL), Jaccard Similarity Probability Contrastive Loss (JSPCL), and Stepwise Label Contrastive Loss (SLCL). We explore the effectiveness of contrastive learning for multi-label text classification by employing these novel losses and provide a set of baseline models for deploying contrastive learning techniques on specific tasks. We further perform an interpretable analysis of our approach to show how the different components of the contrastive losses play their roles. The experimental results show that our proposed contrastive losses improve performance on multi-label text classification tasks, and our work also explores how contrastive learning should be adapted for this setting.
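
As an illustration of the label-overlap idea behind the Jaccard-based losses above, here is a minimal PyTorch sketch of a Jaccard-weighted contrastive loss. The function name, the temperature value, and the exact weighting scheme are illustrative assumptions, not the paper's JSCL/JSPCL definitions.

import torch
import torch.nn.functional as F

def jaccard_weighted_contrastive_loss(embeddings, labels, temperature=0.1):
    # embeddings: (batch, dim) text representations; labels: (batch, num_labels) multi-hot matrix.
    # NOTE: illustrative sketch only, not the authors' exact formulation.
    z = F.normalize(embeddings, dim=1)
    sim = z @ z.t() / temperature                        # pairwise scaled cosine similarities

    labels = labels.float()
    inter = labels @ labels.t()                          # |A ∩ B| for every pair of label sets
    union = labels.sum(1, keepdim=True) + labels.sum(1) - inter
    jaccard = inter / union.clamp(min=1e-8)              # Jaccard similarity of label sets

    batch = z.size(0)
    eye = torch.eye(batch, dtype=torch.bool, device=z.device)
    jaccard = jaccard.masked_fill(eye, 0.0)              # exclude self-pairs

    # Softmax-style log-probability of selecting sample j as a positive for anchor i.
    log_prob = sim - torch.logsumexp(sim.masked_fill(eye, float("-inf")), dim=1, keepdim=True)

    # Weight each pair by its label overlap, so samples sharing more labels are pulled closer.
    weights = jaccard / jaccard.sum(1, keepdim=True).clamp(min=1e-8)
    return -(weights * log_prob).sum(1).mean()

# Example usage with random data:
# emb = torch.randn(8, 128)
# lab = torch.randint(0, 2, (8, 5))
# print(jaccard_weighted_contrastive_loss(emb, lab))
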

2020

Manifold Learning-based Word Representation Refinement Incorporating Global and Local Information
Wenyu Zhao | Dong Zhou | Lin Li | Jinjun Chen
Proceedings of the 28th International Conference on Computational Linguistics

Recent studies show that word embedding models often underestimate similarities between similar words and overestimate similarities between distant words, so that word similarity scores obtained from embedding models are inconsistent with human judgment. Manifold learning-based methods are widely used to refine word representations by re-embedding word vectors from the original embedding space into a new, refined semantic space. These methods mainly focus on preserving local geometry by performing a weighted locally linear combination between words and their neighbors twice. However, the reconstruction weights are easily influenced by the choice of neighboring words, and the whole combination process is time-consuming. In this paper, we propose two novel word representation refinement methods leveraging isometric feature mapping and local tangent spaces, respectively. Unlike previous methods, our first method corrects pre-trained word embeddings by preserving the global geometry of all words instead of the local geometry between words and their neighbors. Our second method refines word representations by aligning the original and refined embedding spaces based on local tangent spaces instead of performing the weighted locally linear combination twice. Experimental results on standard semantic relatedness and semantic similarity tasks show that our methods outperform various state-of-the-art baselines for word representation refinement.
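
To make the global-geometry idea concrete, here is a minimal Python sketch using scikit-learn's Isomap (isometric feature mapping) to re-embed pre-trained word vectors while preserving geodesic distances over a neighborhood graph. The function name, the n_neighbors and n_components values, and the use of scikit-learn are illustrative assumptions; the paper's actual procedure, including its local tangent space alignment variant, may differ.

import numpy as np
from sklearn.manifold import Isomap

def refine_with_isomap(word_vectors, n_neighbors=15, n_components=100):
    # word_vectors: (vocab_size, dim) array of pre-trained embeddings.
    # Isomap estimates geodesic distances over a k-nearest-neighbor graph and
    # re-embeds all words so that these global distances are preserved,
    # in contrast to purely local reconstruction-based refinement.
    iso = Isomap(n_neighbors=n_neighbors, n_components=n_components)
    return iso.fit_transform(np.asarray(word_vectors))

# Example usage with random vectors standing in for pre-trained embeddings:
# vecs = np.random.randn(1000, 300)
# refined = refine_with_isomap(vecs)
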

2016

Enhanced Personalized Search using Social Data
Dong Zhou | Séamus Lawless | Xuan Wu | Wenyu Zhao | Jianxun Liu
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

2010

Dual-Space Re-ranking Model for Document Retrieval
Dong Zhou | Seamus Lawless | Jinming Min | Vincent Wade
Coling 2010: Posters

2009

Latent Document Re-Ranking
Dong Zhou | Vincent Wade
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing