DivGAN: Towards Diverse Paraphrase Generation via Diversified Generative Adversarial Network

Yue Cao, Xiaojun Wan


Abstract
Paraphrases are texts that convey the same meaning in different expression forms. Traditional seq2seq-based models for paraphrase generation mainly focus on fidelity while ignoring the diversity of outputs. In this paper, we propose a deep generative model to generate diverse paraphrases. We build our model on the conditional generative adversarial network and incorporate a simple yet effective diversity loss term into the model to improve the diversity of outputs. The proposed diversity loss maximizes the ratio of the pairwise distance between the generated texts to the pairwise distance between their corresponding latent codes, forcing the generator to attend to the latent codes and produce diverse samples. Experimental results on paraphrase generation benchmarks show that our proposed model generates more diverse paraphrases than the baselines.
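The diversity term described above can be sketched numerically. The snippet below is a hypothetical illustration, not the paper's implementation: it assumes generated texts are represented as fixed-size embedding vectors, and computes the negative mean of the pairwise ratio (output distance over latent-code distance), so that minimizing this term pushes outputs to differ as much as their latent codes do.

```python
import numpy as np

def diversity_loss(gen_embs, latent_codes, eps=1e-8):
    """Hypothetical sketch of a pairwise-ratio diversity loss.

    gen_embs:     (n, d_out) array of embeddings of generated texts
    latent_codes: (n, d_z) array of the latent codes that produced them

    For every pair (i, j) we take the ratio of the distance between the
    two generated embeddings to the distance between their latent codes,
    and return the negative mean ratio, so a generator that minimizes
    this loss maximizes the ratio and avoids collapsing distinct latent
    codes onto near-identical outputs.
    """
    n = len(latent_codes)
    ratios = []
    for i in range(n):
        for j in range(i + 1, n):
            d_out = np.linalg.norm(gen_embs[i] - gen_embs[j])
            d_z = np.linalg.norm(latent_codes[i] - latent_codes[j])
            ratios.append(d_out / (d_z + eps))  # eps avoids division by zero
    return -float(np.mean(ratios))
```

A collapsed generator (identical outputs for all latent codes) gets the worst possible value of this term (zero), while outputs that spread apart with their latent codes drive it negative.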
Anthology ID:
2020.findings-emnlp.218
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2020
Month:
November
Year:
2020
Address:
Online
Editors:
Trevor Cohn, Yulan He, Yang Liu
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
2411–2421
URL:
https://aclanthology.org/2020.findings-emnlp.218
DOI:
10.18653/v1/2020.findings-emnlp.218
Cite (ACL):
Yue Cao and Xiaojun Wan. 2020. DivGAN: Towards Diverse Paraphrase Generation via Diversified Generative Adversarial Network. In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 2411–2421, Online. Association for Computational Linguistics.
Cite (Informal):
DivGAN: Towards Diverse Paraphrase Generation via Diversified Generative Adversarial Network (Cao & Wan, Findings 2020)
PDF:
https://aclanthology.org/2020.findings-emnlp.218.pdf
Data
MS COCO