Hierarchical Sketch Induction for Paraphrase Generation

Tom Hosking, Hao Tang, Mirella Lapata


Abstract
We propose a generative model of paraphrase generation that encourages syntactic diversity by conditioning on an explicit syntactic sketch. We introduce Hierarchical Refinement Quantized Variational Autoencoders (HRQ-VAE), a method for learning decompositions of dense encodings as a sequence of discrete latent variables that make iterative refinements of increasing granularity. This hierarchy of codes is learned through end-to-end training and represents fine-to-coarse-grained information about the input. We use HRQ-VAE to encode the syntactic form of an input sentence as a path through the hierarchy, allowing us to more easily predict syntactic sketches at test time. Extensive experiments, including a human evaluation, confirm that HRQ-VAE learns a hierarchical representation of the input space and generates paraphrases of higher quality than previous systems.
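The abstract's iterative-refinement idea is close in spirit to residual vector quantization: each level of the hierarchy quantizes what the levels above have not yet explained, yielding a path of discrete codes. The minimal numpy sketch below illustrates this decomposition; the names, dimensions, and hard nearest-neighbour assignment are illustrative assumptions, not the paper's exact formulation, which learns the codebooks end-to-end.

```python
import numpy as np

# Minimal sketch of hierarchical refinement quantization (illustrative
# assumptions throughout; HRQ-VAE trains its codebooks end-to-end rather
# than using a fixed hard nearest-neighbour lookup as done here).

rng = np.random.default_rng(0)
DEPTH, CODES, DIM = 3, 16, 8                 # hierarchy depth, codebook size, embedding dim
codebooks = rng.normal(size=(DEPTH, CODES, DIM))

def quantize(z):
    """Decompose a dense encoding z into a path of discrete codes.

    At each level, the codebook entry nearest to the current residual is
    chosen, so earlier levels capture coarser information and later
    levels make finer refinements.
    """
    path, recon = [], np.zeros(DIM)
    for d in range(DEPTH):
        residual = z - recon                          # what is still unexplained
        dists = np.linalg.norm(codebooks[d] - residual, axis=1)
        k = int(np.argmin(dists))                     # discrete code at level d
        path.append(k)
        recon = recon + codebooks[d, k]               # iterative refinement
    return path, recon

z = rng.normal(size=DIM)
path, recon = quantize(z)
print("code path:", path, "| reconstruction error:", np.linalg.norm(z - recon))
```

In the paper, such a code path encodes the syntactic form of a sentence, which is what makes sketch prediction at test time a sequence of small discrete choices rather than one large one.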
Anthology ID:
2022.acl-long.178
Volume:
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
May
Year:
2022
Address:
Dublin, Ireland
Editors:
Smaranda Muresan, Preslav Nakov, Aline Villavicencio
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
2489–2501
URL:
https://aclanthology.org/2022.acl-long.178
DOI:
10.18653/v1/2022.acl-long.178
Cite (ACL):
Tom Hosking, Hao Tang, and Mirella Lapata. 2022. Hierarchical Sketch Induction for Paraphrase Generation. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 2489–2501, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
Hierarchical Sketch Induction for Paraphrase Generation (Hosking et al., ACL 2022)
PDF:
https://aclanthology.org/2022.acl-long.178.pdf
Video:
https://aclanthology.org/2022.acl-long.178.mp4
Code:
tomhosking/hrq-vae
Data:
MS COCO, Paralex, Quora Question Pairs