Zhi-Hong Deng

Also published as: Zhihong Deng


2024

pdf bib
Retrieved Sequence Augmentation for Protein Representation Learning
Chang Ma | Haiteng Zhao | Lin Zheng | Jiayi Xin | Qintong Li | Lijun Wu | Zhihong Deng | Yang Young Lu | Qi Liu | Sheng Wang | Lingpeng Kong
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

Protein Language Models traditionally depend on Multiple Sequence Alignments (MSA) to incorporate evolutionary knowledge. However, MSA-based approaches suffer from substantial computational overhead and generally underperform in generalizing to de novo proteins. This study reevaluates the role of MSA, proposing it as a retrieval augmentation method and questioning the necessity of sequence alignment. We show that a simple alternative, Retrieved Sequence Augmentation (RSA), can enhance protein representation learning without the need for alignment and cumbersome preprocessing. RSA surpasses MSA Transformer by an average of 5% in both structural and property prediction tasks while being 373 times faster. Additionally, RSA demonstrates enhanced transferability for predicting de novo proteins. This methodology addresses a critical need for efficiency in protein prediction and can be rapidly employed to identify homologous sequences, improve representation learning, and enhance the capacity of Large Language Models to interpret protein structures.

2023

pdf bib
Dual-Alignment Pre-training for Cross-lingual Sentence Embedding
Ziheng Li | Shaohan Huang | Zihan Zhang | Zhi-Hong Deng | Qiang Lou | Haizhen Huang | Jian Jiao | Furu Wei | Weiwei Deng | Qi Zhang
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Recent studies have shown that dual encoder models trained with the sentence-level translation ranking task are effective methods for cross-lingual sentence embedding. However, our research indicates that token-level alignment is also crucial in multilingual scenarios, which has not been fully explored previously. Based on our findings, we propose a dual-alignment pre-training (DAP) framework for cross-lingual sentence embedding that incorporates both sentence-level and token-level alignment. To achieve this, we introduce a novel representation translation learning (RTL) task, where the model learns to use one-side contextualized token representation to reconstruct its translation counterpart. This reconstruction objective encourages the model to embed translation information into the token representation. Compared to other token-level alignment methods such as translation language modeling, RTL is more suitable for dual encoder architectures and is computationally efficient. Extensive experiments on three sentence-level cross-lingual benchmarks demonstrate that our approach can significantly improve sentence embedding. Our code is available at https://github.com/ChillingDream/DAP.

pdf bib
Prefix-Tuning Based Unsupervised Text Style Transfer
Huiyu Mai | Wenhao Jiang | Zhi-Hong Deng
Findings of the Association for Computational Linguistics: EMNLP 2023

Unsupervised text style transfer aims at training a generative model that can alter the style of the input sentence while preserving its content without using any parallel data. In this paper, we employ powerful pre-trained large language models and present a new prefix-tuning-based method for unsupervised text style transfer. We construct three different kinds of prefixes, i.e., shared prefix, style prefix, and content prefix, to encode task-specific information, target style, and the content information of the input sentence, respectively. Compared to embeddings used by previous works, the proposed prefixes can provide richer information for the model. Furthermore, we adopt a recursive way of using language models in the process of style transfer. This strategy provides a more effective way for the interactions between the input sentence and GPT-2, helps the model construct more informative prefixes, and thus, helps improve the performance. Evaluations on the well-known datasets show that our method outperforms the state-of-the-art baselines. Results, analysis of ablation studies, and subjective evaluations from humans are also provided for a deeper understanding of the proposed method.

2018

pdf bib
MEMD: A Diversity-Promoting Learning Framework for Short-Text Conversation
Meng Zou | Xihan Li | Haokun Liu | Zhihong Deng
Proceedings of the 27th International Conference on Computational Linguistics

Neural encoder-decoder models have been widely applied to conversational response generation, which is a research hot spot in recent years. However, conventional neural encoder-decoder models tend to generate commonplace responses like “I don’t know” regardless of what the input is. In this paper, we analyze this problem from a new perspective: latent vectors. Based on it, we propose an easy-to-extend learning framework named MEMD (Multi-Encoder to Multi-Decoder), in which an auxiliary encoder and an auxiliary decoder are introduced to provide necessary training guidance without resorting to extra data or complicating network’s inner structure. Experimental results demonstrate that our method effectively improve the quality of generated responses according to automatic metrics and human evaluations, yielding more diverse and smooth replies.

pdf bib
Unsupervised Neural Word Segmentation for Chinese via Segmental Language Modeling
Zhiqing Sun | Zhi-Hong Deng
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Previous traditional approaches to unsupervised Chinese word segmentation (CWS) can be roughly classified into discriminative and generative models. The former uses the carefully designed goodness measures for candidate segmentation, while the latter focuses on finding the optimal segmentation of the highest generative probability. However, while there exists a trivial way to extend the discriminative models into neural version by using neural language models, those of generative ones are non-trivial. In this paper, we propose the segmental language models (SLMs) for CWS. Our approach explicitly focuses on the segmental nature of Chinese, as well as preserves several properties of language models. In SLMs, a context encoder encodes the previous context and a segment decoder generates each segment incrementally. As far as we know, we are the first to propose a neural model for unsupervised CWS and achieve competitive performance to the state-of-the-art statistical models on four different datasets from SIGHAN 2005 bakeoff.

2017

pdf bib
Inter-Weighted Alignment Network for Sentence Pair Modeling
Gehui Shen | Yunlun Yang | Zhi-Hong Deng
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

Sentence pair modeling is a crucial problem in the field of natural language processing. In this paper, we propose a model to measure the similarity of a sentence pair focusing on the interaction information. We utilize the word level similarity matrix to discover fine-grained alignment of two sentences. It should be emphasized that each word in a sentence has a different importance from the perspective of semantic composition, so we exploit two novel and efficient strategies to explicitly calculate a weight for each word. Although the proposed model only use a sequential LSTM for sentence modeling without any external resource such as syntactic parser tree and additional lexicon features, experimental results show that our model achieves state-of-the-art performance on three datasets of two tasks.

2016

pdf bib
A Position Encoding Convolutional Neural Network Based on Dependency Tree for Relation Classification
Yunlun Yang | Yunhai Tong | Shulei Ma | Zhi-Hong Deng
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

pdf bib
An Unsupervised Multi-Document Summarization Framework Based on Neural Document Model
Shulei Ma | Zhi-Hong Deng | Yunlun Yang
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

In the age of information exploding, multi-document summarization is attracting particular attention for the ability to help people get the main ideas in a short time. Traditional extractive methods simply treat the document set as a group of sentences while ignoring the global semantics of the documents. Meanwhile, neural document model is effective on representing the semantic content of documents in low-dimensional vectors. In this paper, we propose a document-level reconstruction framework named DocRebuild, which reconstructs the documents with summary sentences through a neural document model and selects summary sentences to minimize the reconstruction error. We also apply two strategies, sentence filtering and beamsearch, to improve the performance of our method. Experimental results on the benchmark datasets DUC 2006 and DUC 2007 show that DocRebuild is effective and outperforms the state-of-the-art unsupervised algorithms.

2015

pdf bib
JEAM: A Novel Model for Cross-Domain Sentiment Classification Based on Emotion Analysis
Kun-Hu Luo | Zhi-Hong Deng | Hongliang Yu | Liang-Chen Wei
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

2014

pdf bib
A Novel Content Enriching Model for Microblog Using News Corpus
Yunlun Yang | Zhihong Deng | Hongliang Yu
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

2013

pdf bib
Identifying Sentiment Words Using an Optimization-based Model without Seed Words
Hongliang Yu | Zhi-Hong Deng | Shiyingxue Li
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)