@inproceedings{blin-kucharavy-2021-transformer,
title = "Can the Transformer Be Used as a Drop-in Replacement for {RNN}s in Text-Generating {GAN}s?",
author = "Blin, Kevin and
Kucharavy, Andrei",
editor = "Mitkov, Ruslan and
Angelova, Galia",
booktitle = "Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021)",
month = sep,
year = "2021",
address = "Held Online",
publisher = "INCOMA Ltd.",
url = "https://aclanthology.org/2021.ranlp-1.21",
pages = "173--181",
abstract = "In this paper we address the problem of fine-tuned text generation with a limited computational budget. For that, we use a well-performing text generative adversarial network (GAN) architecture - Diversity-Promoting GAN (DPGAN), and attempted a drop-in replacement of the LSTM layer with a self-attention-based Transformer layer in order to leverage their efficiency. The resulting Self-Attention DPGAN (SADPGAN) was evaluated for performance, quality and diversity of generated text and stability. Computational experiments suggested that a transformer architecture is unable to drop-in replace the LSTM layer, under-performing during the pre-training phase and undergoing a complete mode collapse during the GAN tuning phase. Our results suggest that the transformer architecture need to be adapted before it can be used as a replacement for RNNs in text-generating GANs.",
}
<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="blin-kucharavy-2021-transformer">
<titleInfo>
<title>Can the Transformer Be Used as a Drop-in Replacement for RNNs in Text-Generating GANs?</title>
</titleInfo>
<name type="personal">
<namePart type="given">Kevin</namePart>
<namePart type="family">Blin</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Andrei</namePart>
<namePart type="family">Kucharavy</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<originInfo>
<dateIssued>2021-09</dateIssued>
</originInfo>
<typeOfResource>text</typeOfResource>
<relatedItem type="host">
<titleInfo>
<title>Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021)</title>
</titleInfo>
<name type="personal">
<namePart type="given">Ruslan</namePart>
<namePart type="family">Mitkov</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Galia</namePart>
<namePart type="family">Angelova</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<originInfo>
<publisher>INCOMA Ltd.</publisher>
<place>
<placeTerm type="text">Held Online</placeTerm>
</place>
</originInfo>
<genre authority="marcgt">conference publication</genre>
</relatedItem>
<abstract>In this paper, we address the problem of fine-tuned text generation with a limited computational budget. To that end, we use a well-performing text generative adversarial network (GAN) architecture, the Diversity-Promoting GAN (DPGAN), and attempt a drop-in replacement of its LSTM layer with a self-attention-based Transformer layer in order to leverage the Transformer's efficiency. The resulting Self-Attention DPGAN (SADPGAN) was evaluated for performance, quality and diversity of the generated text, and stability. Computational experiments suggest that the Transformer architecture cannot serve as a drop-in replacement for the LSTM layer, under-performing during the pre-training phase and undergoing complete mode collapse during the GAN tuning phase. Our results suggest that the Transformer architecture needs to be adapted before it can be used as a replacement for RNNs in text-generating GANs.</abstract>
<identifier type="citekey">blin-kucharavy-2021-transformer</identifier>
<location>
<url>https://aclanthology.org/2021.ranlp-1.21</url>
</location>
<part>
<date>2021-09</date>
<extent unit="page">
<start>173</start>
<end>181</end>
</extent>
</part>
</mods>
</modsCollection>
%0 Conference Proceedings
%T Can the Transformer Be Used as a Drop-in Replacement for RNNs in Text-Generating GANs?
%A Blin, Kevin
%A Kucharavy, Andrei
%Y Mitkov, Ruslan
%Y Angelova, Galia
%S Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021)
%D 2021
%8 September
%I INCOMA Ltd.
%C Held Online
%F blin-kucharavy-2021-transformer
%X In this paper, we address the problem of fine-tuned text generation with a limited computational budget. To that end, we use a well-performing text generative adversarial network (GAN) architecture, the Diversity-Promoting GAN (DPGAN), and attempt a drop-in replacement of its LSTM layer with a self-attention-based Transformer layer in order to leverage the Transformer's efficiency. The resulting Self-Attention DPGAN (SADPGAN) was evaluated for performance, quality and diversity of the generated text, and stability. Computational experiments suggest that the Transformer architecture cannot serve as a drop-in replacement for the LSTM layer, under-performing during the pre-training phase and undergoing complete mode collapse during the GAN tuning phase. Our results suggest that the Transformer architecture needs to be adapted before it can be used as a replacement for RNNs in text-generating GANs.
%U https://aclanthology.org/2021.ranlp-1.21
%P 173-181
Markdown (Informal)
[Can the Transformer Be Used as a Drop-in Replacement for RNNs in Text-Generating GANs?](https://aclanthology.org/2021.ranlp-1.21) (Blin & Kucharavy, RANLP 2021)
ACL
Kevin Blin and Andrei Kucharavy. 2021. [Can the Transformer Be Used as a Drop-in Replacement for RNNs in Text-Generating GANs?](https://aclanthology.org/2021.ranlp-1.21). In *Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021)*, pages 173–181, Held Online. INCOMA Ltd.
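
The abstract above describes swapping the LSTM core of the DPGAN generator for a self-attention (Transformer) layer. The sketch below is not the authors' code; it is a minimal, hedged PyTorch illustration of what such a "drop-in replacement" of a recurrent generator core could look like. All names (`TokenGenerator`), sizes, and the choice of `nn.TransformerEncoderLayer` with a causal mask are assumptions made for illustration only.

```python
# Hedged sketch, NOT the SADPGAN implementation: a toy generator whose
# recurrent core can be swapped between an LSTM and a self-attention layer.
import torch
import torch.nn as nn


class TokenGenerator(nn.Module):
    """Toy autoregressive token generator with a swappable core (assumed design)."""

    def __init__(self, vocab_size=5000, emb_dim=128, hidden_dim=128, core="lstm"):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        if core == "lstm":
            # LSTM core, as in RNN-based text GAN generators (per the abstract).
            self.core = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
            out_dim = hidden_dim
        else:
            # Attempted drop-in: a single self-attention Transformer layer.
            self.core = nn.TransformerEncoderLayer(
                d_model=emb_dim, nhead=4, dim_feedforward=hidden_dim, batch_first=True
            )
            out_dim = emb_dim
        self.core_type = core
        self.proj = nn.Linear(out_dim, vocab_size)

    def forward(self, token_ids):
        x = self.embed(token_ids)                     # (batch, seq, emb_dim)
        if self.core_type == "lstm":
            x, _ = self.core(x)                       # LSTM returns (output, (h, c))
        else:
            # Causal mask keeps the self-attention layer autoregressive.
            mask = nn.Transformer.generate_square_subsequent_mask(token_ids.size(1))
            x = self.core(x, src_mask=mask)
        return self.proj(x)                           # per-position vocabulary logits


if __name__ == "__main__":
    tokens = torch.randint(0, 5000, (2, 16))          # dummy batch of token ids
    for core in ("lstm", "transformer"):
        logits = TokenGenerator(core=core)(tokens)
        print(core, logits.shape)                     # torch.Size([2, 16, 5000])
```

The point of the sketch is only that the two cores expose compatible tensor shapes, which is what makes the replacement "drop-in" at the interface level; the paper's finding is that this interface-level swap is not sufficient, since the Transformer variant under-performs in pre-training and mode-collapses during GAN tuning.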