Do sequence-to-sequence VAEs learn global features of sentences?

Tom Bosc; Pascal Vincent

doi:10.18653/v1/2020.emnlp-main.350

Abstract

Autoregressive language models are powerful and relatively easy to train. However, these models are usually trained without explicit conditioning labels and do not offer easy ways to control global aspects such as sentiment or topic during generation. Bowman et al. (2016) adapted the Variational Autoencoder (VAE) for natural language with the sequence-to-sequence architecture and claimed that the latent vector was able to capture such global features in an unsupervised manner. We question this claim. We measure which words benefit most from the latent information by decomposing the reconstruction loss per position in the sentence. Using this method, we find that VAEs are prone to memorizing the first words and the sentence length, producing local features of limited usefulness. To alleviate this, we investigate alternative architectures based on bag-of-words assumptions and language model pretraining. These variants learn latent variables that are more global, i.e., more predictive of topic or sentiment labels. Moreover, using reconstructions, we observe that they decrease memorization: the first word and the sentence length are not recovered as accurately as with the baselines, consequently yielding more diverse reconstructions.
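
The per-position decomposition mentioned in the abstract can be illustrated with a minimal sketch. It assumes that per-token negative log-likelihoods are already available for two models: a latent-conditioned decoder and a baseline without latent information. The abstract does not specify the exact baseline or procedure, so the function names (`per_position_loss`, `latent_gain`) and the toy numbers below are hypothetical and serve only as an illustration.

```python
from typing import List, Sequence


def per_position_loss(token_nll: Sequence[Sequence[float]], max_len: int) -> List[float]:
    """Average the per-token negative log-likelihood at each sentence position.

    token_nll[i][t] is the (hypothetical) loss of sentence i at position t.
    Sentences may have different lengths.
    """
    sums = [0.0] * max_len
    counts = [0] * max_len
    for sent in token_nll:
        for t, nll in enumerate(sent[:max_len]):
            sums[t] += nll
            counts[t] += 1
    # Positions that no sentence reaches get NaN rather than a misleading 0.
    return [s / c if c else float("nan") for s, c in zip(sums, counts)]


def latent_gain(
    baseline_nll: Sequence[Sequence[float]],
    vae_nll: Sequence[Sequence[float]],
    max_len: int,
) -> List[float]:
    """Per-position loss reduction of the latent-conditioned model over the baseline."""
    base = per_position_loss(baseline_nll, max_len)
    vae = per_position_loss(vae_nll, max_len)
    return [b - v for b, v in zip(base, vae)]


# Toy, made-up losses for two sentences (lengths 3 and 2); not results from the paper.
baseline = [[5.0, 4.0, 3.0], [6.0, 4.0]]
vae = [[2.0, 3.5, 3.0], [3.0, 3.5]]
gains = latent_gain(baseline, vae, max_len=3)
print(gains)
```

In this toy setting, a gain concentrated at position 0 would correspond to the kind of first-word memorization the abstract describes, whereas gains spread across positions would suggest more global latent information.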

Anthology ID: 2020.emnlp-main.350
Volume: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
Month: November
Year: 2020
Address: Online
Editors: Bonnie Webber, Trevor Cohn, Yulan He, Yang Liu
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 4296–4318
URL: https://aclanthology.org/2020.emnlp-main.350/
DOI: 10.18653/v1/2020.emnlp-main.350
Cite (ACL): Tom Bosc and Pascal Vincent. 2020. Do sequence-to-sequence VAEs learn global features of sentences? In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 4296–4318, Online. Association for Computational Linguistics.
Cite (Informal): Do sequence-to-sequence VAEs learn global features of sentences? (Bosc & Vincent, EMNLP 2020)
PDF: https://aclanthology.org/2020.emnlp-main.350.pdf
Video: https://slideslive.com/38939119
