Controllable Text Generation with Residual Memory Transformer

Hanqing Zhang, Si Sun, Haiming Wu, Dawei Song


Abstract
Large-scale Causal Language Models (CLMs), e.g., GPT-3 and ChatGPT, have brought great success in text generation. However, effectively controlling the generation process of a CLM while balancing flexibility, control granularity, and generation efficiency remains an open challenge. In this paper, we provide a new alternative for controllable text generation (CTG) by designing a non-intrusive, lightweight control plugin, namely the Residual Memory Transformer (RMT), to accompany the generation of a CLM at arbitrary time steps. With an encoder-decoder setup, RMT can accept any type of control condition and cooperate with the base CLM through a residual learning paradigm, to achieve more flexible, general, and efficient CTG. Extensive experiments are carried out on various control tasks, in the form of both automatic and human evaluations. The results demonstrate the superiority of RMT over a wide range of state-of-the-art CTG approaches. The code implementation of our work is available at: https://github.com/Residual_Memory_Transformer.
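The abstract describes a plugin that cooperates with a frozen base CLM through residual learning. As a minimal, hypothetical sketch of that idea (the paper's actual fusion scheme is not given on this page), a control module can propose residual logits that are added to the base model's logits before sampling; the names `combine_logits` and `alpha` below are illustrative assumptions, not the paper's API:

```python
import math

def combine_logits(base_logits, residual_logits, alpha=1.0):
    # Hypothetical residual fusion: add the control plugin's residual
    # logits (scaled by alpha) to the frozen base CLM's logits.
    return [b + alpha * r for b, r in zip(base_logits, residual_logits)]

def softmax(logits):
    # Numerically stable softmax over a list of logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Toy vocabulary of 3 tokens: the residual shifts probability mass
# toward token 2, steering generation without modifying the base model.
base = [2.0, 1.0, 0.5]
residual = [0.0, 0.0, 2.0]
probs = softmax(combine_logits(base, residual))
```

Because the fusion happens purely at the output side, the base CLM's parameters stay untouched, which is consistent with the abstract's description of a non-intrusive plugin.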
Anthology ID:
2024.findings-acl.62
Volume:
Findings of the Association for Computational Linguistics: ACL 2024
Month:
August
Year:
2024
Address:
Bangkok, Thailand and virtual meeting
Editors:
Lun-Wei Ku, Andre Martins, Vivek Srikumar
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
1048–1066
URL:
https://aclanthology.org/2024.findings-acl.62
Cite (ACL):
Hanqing Zhang, Si Sun, Haiming Wu, and Dawei Song. 2024. Controllable Text Generation with Residual Memory Transformer. In Findings of the Association for Computational Linguistics: ACL 2024, pages 1048–1066, Bangkok, Thailand and virtual meeting. Association for Computational Linguistics.
Cite (Informal):
Controllable Text Generation with Residual Memory Transformer (Zhang et al., Findings 2024)
PDF:
https://aclanthology.org/2024.findings-acl.62.pdf