Generative Pretraining for Paraphrase Evaluation

Jack Weston, Raphael Lenain, Udeepa Meepegama, Emil Fristed


Abstract
We introduce ParaBLEU, a paraphrase representation learning model and evaluation metric for text generation. Unlike previous approaches, ParaBLEU learns to understand paraphrasis using generative conditioning as a pretraining objective. ParaBLEU correlates more strongly with human judgements than existing metrics, obtaining new state-of-the-art results on the 2017 WMT Metrics Shared Task. We show that our model is robust to data scarcity, exceeding previous state-of-the-art performance using only 50% of the available training data and surpassing BLEU, ROUGE and METEOR with only 40 labelled examples. Finally, we demonstrate that ParaBLEU can be used to conditionally generate novel paraphrases from a single demonstration, which we use to confirm our hypothesis that it learns abstract, generalized paraphrase representations.
Anthology ID:
2022.acl-long.280
Volume:
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
May
Year:
2022
Address:
Dublin, Ireland
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
4052–4073
URL:
https://aclanthology.org/2022.acl-long.280
DOI:
10.18653/v1/2022.acl-long.280
Cite (ACL):
Jack Weston, Raphael Lenain, Udeepa Meepegama, and Emil Fristed. 2022. Generative Pretraining for Paraphrase Evaluation. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 4052–4073, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
Generative Pretraining for Paraphrase Evaluation (Weston et al., ACL 2022)
PDF:
https://aclanthology.org/2022.acl-long.280.pdf
Data
GLUE, MRPC, MultiNLI, PARANMT-50M, PAWS, SNLI