Generative Pretraining for Paraphrase Evaluation

Jack Weston, Raphael Lenain, Udeepa Meepegama, Emil Fristed


Abstract
We introduce ParaBLEU, a paraphrase representation learning model and evaluation metric for text generation. Unlike previous approaches, ParaBLEU learns to understand paraphrasis using generative conditioning as a pretraining objective. ParaBLEU correlates more strongly with human judgements than existing metrics, obtaining new state-of-the-art results on the 2017 WMT Metrics Shared Task. We show that our model is robust to data scarcity, exceeding previous state-of-the-art performance using only 50% of the available training data and surpassing BLEU, ROUGE and METEOR with only 40 labelled examples. Finally, we demonstrate that ParaBLEU can be used to conditionally generate novel paraphrases from a single demonstration, which we use to confirm our hypothesis that it learns abstract, generalized paraphrase representations.
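The abstract's headline comparison is segment-level correlation between metric scores and human judgements, as in the WMT Metrics Shared Task. The sketch below is illustrative only, assuming Python with scipy and sacrebleu; the `metric` argument is a generic scoring function, and the ParaBLEU model itself is not reproduced here.

```python
# Illustrative sketch (not the authors' code): comparing an automatic
# paraphrase/translation metric against human judgements via segment-level
# Pearson correlation, with sentence-level BLEU as a baseline metric.
from scipy.stats import pearsonr
import sacrebleu


def bleu_score(reference: str, candidate: str) -> float:
    """Sentence-level BLEU baseline via sacrebleu."""
    return sacrebleu.sentence_bleu(candidate, [reference]).score


def correlation_with_humans(references, candidates, human_scores, metric):
    """Pearson correlation between a metric's scores and human judgements.

    A learned metric such as ParaBLEU would be passed in place of `metric`;
    here we only demonstrate the evaluation protocol with BLEU.
    """
    metric_scores = [metric(ref, cand) for ref, cand in zip(references, candidates)]
    r, _ = pearsonr(metric_scores, human_scores)
    return r


if __name__ == "__main__":
    refs = ["The cat sat on the mat.", "He finished the report on time."]
    cands = ["A cat was sitting on the mat.", "The report was completed punctually."]
    humans = [0.9, 0.8]  # hypothetical human adequacy scores
    print(correlation_with_humans(refs, cands, humans, bleu_score))
```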
Anthology ID:
2022.acl-long.280
Volume:
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
May
Year:
2022
Address:
Dublin, Ireland
Editors:
Smaranda Muresan, Preslav Nakov, Aline Villavicencio
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
4052–4073
URL:
https://aclanthology.org/2022.acl-long.280
DOI:
10.18653/v1/2022.acl-long.280
Cite (ACL):
Jack Weston, Raphael Lenain, Udeepa Meepegama, and Emil Fristed. 2022. Generative Pretraining for Paraphrase Evaluation. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 4052–4073, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
Generative Pretraining for Paraphrase Evaluation (Weston et al., ACL 2022)
PDF:
https://aclanthology.org/2022.acl-long.280.pdf
Data
GLUE, MRPC, MultiNLI, PARANMT-50M, PAWS, SNLI