2024
pdf
bib
abs
Adversarial Preference Optimization: Enhancing Your Alignment via RM-LLM Game
Pengyu Cheng
|
Yifan Yang
|
Jian Li
|
Yong Dai
|
Tianhao Hu
|
Peixin Cao
|
Nan Du
|
Xiaolong Li
Findings of the Association for Computational Linguistics: ACL 2024
Human preference alignment is essential to improve the interaction quality of large language models (LLMs). Existing alignment methods depend on manually annotated preference data to guide the LLM optimization directions. However, continuously updating LLMs for alignment raises a distribution gap between model-generated samples and human-annotated responses, hindering training effectiveness. To mitigate this issue, previous methods require additional preference annotation on newly generated samples to adapt to the shifted distribution, which consumes a large amount of annotation resources. Targeting more efficient human preference optimization, we propose an Adversarial Preference Optimization (APO) framework, in which the LLM and the reward model update alternatively via a min-max game. Through adversarial training, the reward model can adapt to the shifted generation distribution of the LLM without any additional annotation. With comprehensive experiments, we find the proposed adversarial training framework further enhances existing alignment baselines in terms of LLM helpfulness and harmlessness. The code is at https://github.com/Linear95/APO.
pdf
bib
abs
On Diversified Preferences of Large Language Model Alignment
Dun Zeng
|
Yong Dai
|
Pengyu Cheng
|
Longyue Wang
|
Tianhao Hu
|
Wanshun Chen
|
Nan Du
|
Zenglin Xu
Findings of the Association for Computational Linguistics: EMNLP 2024
Aligning large language models (LLMs) with human preferences has been recognized as the key to improving LLMs’ interaction quality. However, in this pluralistic world, human preferences can be diversified due to annotators’ different tastes, which hinders the effectiveness of LLM alignment methods. This paper presents the first quantitative analysis of the experimental scaling law for reward models with varying sizes, from 1.3 billion to 7 billion parameters, trained with human feedback exhibiting diverse preferences. Our analysis reveals that the impact of diversified human preferences depends on both model size and data size. Larger models with sufficient capacity mitigate the negative effects of diverse preferences, while smaller models struggle to accommodate them. To mitigate the impact of diverse preferences, we introduce a new metric, Expected Calibration Error (ECE), to evaluate RMs and show their obvious positive correlation with the alignment performance of LLMs. Furthermore, we propose a Multi-Objective Reward learning method (MORE) to enhance the calibration performance of RMs on shared preferences. Through experiments on four models and five human preference datasets, we find the calibration error can be adopted as a key metric for evaluating RMs and MORE can obtain superior alignment performance.
pdf
bib
abs
Chunk, Align, Select: A Simple Long-sequence Processing Method for Transformers
Jiawen Xie
|
Pengyu Cheng
|
Xiao Liang
|
Yong Dai
|
Nan Du
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Although dominant in natural language processing, transformer-based models still struggle with long-sequence processing, due to the computational costs of their self-attention operations, which increase exponentially as the length of the input sequence grows. To address this challenge, we propose a **Sim**ple framework to enhance the long-content processing of off-the-shelf pre-trained transformers via three steps: **C**hunk, **A**lign, and **S**elect (SimCAS). More specifically, we first divide each long-sequence input into a batch of chunks, then align the inter-chunk information during the encoding steps, and finally, select the most representative hidden states from the encoder for the decoding process. With our SimCAS, the computation and memory costs can be reduced to linear complexity. In experiments, we demonstrate the effectiveness of the proposed method on various real-world long-text summarization and reading comprehension tasks, in which SimCAS significantly outperforms prior long-sequence processing baselines. The code is at [https://github.com/xjw-nlp/SimCAS](https://github.com/xjw-nlp/SimCAS).
2020
pdf
bib
abs
Improving Disentangled Text Representation Learning with Information-Theoretic Guidance
Pengyu Cheng
|
Martin Renqiang Min
|
Dinghan Shen
|
Christopher Malon
|
Yizhe Zhang
|
Yitong Li
|
Lawrence Carin
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
Learning disentangled representations of natural language is essential for many NLP tasks, e.g., conditional text generation, style transfer, personalized dialogue systems, etc. Similar problems have been studied extensively for other forms of data, such as images and videos. However, the discrete nature of natural language makes the disentangling of textual representations more challenging (e.g., the manipulation over the data space cannot be easily achieved). Inspired by information theory, we propose a novel method that effectively manifests disentangled representations of text, without any supervision on semantics. A new mutual information upper bound is derived and leveraged to measure dependence between style and content. By minimizing this upper bound, the proposed method induces style and content embeddings into two independent low-dimensional spaces. Experiments on both conditional text generation and text-style transfer demonstrate the high quality of our disentangled representation in terms of content and style preservation.
2019
pdf
bib
abs
Learning Compressed Sentence Representations for On-Device Text Processing
Dinghan Shen
|
Pengyu Cheng
|
Dhanasekar Sundararaman
|
Xinyuan Zhang
|
Qian Yang
|
Meng Tang
|
Asli Celikyilmaz
|
Lawrence Carin
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
Vector representations of sentences, trained on massive text corpora, are widely used as generic sentence embeddings across a variety of NLP problems. The learned representations are generally assumed to be continuous and real-valued, giving rise to a large memory footprint and slow retrieval speed, which hinders their applicability to low-resource (memory and computation) platforms, such as mobile devices. In this paper, we propose four different strategies to transform continuous and generic sentence embeddings into a binarized form, while preserving their rich semantic information. The introduced methods are evaluated across a wide range of downstream tasks, where the binarized sentence embeddings are demonstrated to degrade performance by only about 2% relative to their continuous counterparts, while reducing the storage requirement by over 98%. Moreover, with the learned binary representations, the semantic relatedness of two sentences can be evaluated by simply calculating their Hamming distance, which is more computational efficient compared with the inner product operation between continuous embeddings. Detailed analysis and case study further validate the effectiveness of proposed methods.
pdf
bib
abs
Improving Textual Network Embedding with Global Attention via Optimal Transport
Liqun Chen
|
Guoyin Wang
|
Chenyang Tao
|
Dinghan Shen
|
Pengyu Cheng
|
Xinyuan Zhang
|
Wenlin Wang
|
Yizhe Zhang
|
Lawrence Carin
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
Constituting highly informative network embeddings is an essential tool for network analysis. It encodes network topology, along with other useful side information, into low dimensional node-based feature representations that can be exploited by statistical modeling. This work focuses on learning context-aware network embeddings augmented with text data. We reformulate the network embedding problem, and present two novel strategies to improve over traditional attention mechanisms: (i) a content-aware sparse attention module based on optimal transport; and (ii) a high-level attention parsing module. Our approach yields naturally sparse and self-normalized relational inference. It can capture long-term interactions between sequences, thus addressing the challenges faced by existing textual network embedding schemes. Extensive experiments are conducted to demonstrate our model can consistently outperform alternative state-of-the-art methods.