Pu-Jen Cheng


2024

Plug-in Language Model: Controlling Text Generation with a Simple Regression Model
Nai-Chi Yang | Wei-Yun Ma | Pu-Jen Cheng
Findings of the Association for Computational Linguistics: NAACL 2024

Large-scale pre-trained language models have displayed unrivaled capacity in generating text that closely resembles human-written text. Nevertheless, generating text that adheres to specific conditions without fine-tuning or adding new parameters can be challenging. Contemporary approaches commonly rely on either prompts or auxiliary models to avoid modifying the language model. These auxiliary models are designed to assess whether a generated token contributes to meeting the desired requirements. Such approaches adjust the distribution of the next token during the inference phase by using the prediction score of the desired attribute to calculate gradients. However, these auxiliary models typically require the language model’s latent states, which makes it difficult to integrate existing black-box attribute models or tools. We present the Plug-in Language Model (PiLM) to address these limitations. PiLM leverages reinforcement learning to utilize black-box tools directly, adjusting the latent state to control text generation. However, performing backpropagation during the inference phase is time-consuming for PiLM. By replacing backpropagation with a simple regression model, PiLM can achieve an inference time comparable to that of the original LLM. Experimental results show that our approach outperforms existing state-of-the-art methods that rely on gradient-based, weighted-decoding, or prompt-based methodologies.

2022

R-TeaFor: Regularized Teacher-Forcing for Abstractive Summarization
Guan-Yu Lin | Pu-Jen Cheng
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

Teacher-forcing is widely used in training sequence generation models to improve sampling efficiency and to stabilize training. However, teacher-forcing is vulnerable to the exposure bias problem. Previous works have attempted to address exposure bias by modifying the training data to simulate model-generated results. Nevertheless, they do not consider the pairwise relationship between the original training data and the modified data, which provides additional information during training. Hence, we propose Regularized Teacher-Forcing (R-TeaFor) to utilize this relationship for better regularization. Empirically, our experiments show that R-TeaFor outperforms previous state-of-the-art summarization models, and the results generalize to different pre-trained models.

2009

Web Mining for Unsupervised Classification
Wei-Yen Day | Chun-Yi Chi | Ruey-Cheng Chen | Pu-Jen Cheng | Pei-Sen Liu
Proceedings of the 21st Conference on Computational Linguistics and Speech Processing

Query Formulation by Selecting Good Terms
Chia-Jung Lee | Yi-Chun Lin | Ruey-Cheng Chen | Pei-Sen Liu | Pu-Jen Cheng
Proceedings of the 21st Conference on Computational Linguistics and Speech Processing

2004

Creating Multilingual Translation Lexicons with Regional Variations Using Web Corpora
Pu-Jen Cheng | Wen-Hsiang Lu | Jei-Wen Teng | Lee-Feng Chien
Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL-04)