Learning to Selectively Learn for Weakly Supervised Paraphrase Generation with Model-based Reinforcement Learning

Haiyan Yin, Dingcheng Li, Ping Li


Abstract
Paraphrase generation is an important language generation task that aims to interpret user intent and systematically generate new phrases whose meanings are identical to the given ones. However, its effectiveness is constrained by access to gold labeled data pairs, where both the amount and the quality of the training pairs are limited. In this paper, we propose a new weakly supervised paraphrase generation approach that extends a recent line of work leveraging reinforcement learning for effective model training with data selection. While data selection is well suited to target tasks with noisy data, developing a reinforced selective learning regime faces several unresolved challenges. We discuss these challenges and present a new model that partially overcomes them through a model-based planning feature and a reward normalization feature. Extensive evaluation on four weakly supervised paraphrase generation tasks shows that our method significantly improves the state-of-the-art performance on the evaluation datasets.
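To make the selective-learning idea concrete, below is a minimal, hypothetical sketch of reinforced data selection with one simple form of reward normalization (a running baseline). It is not the authors' model: the paraphrase generator and its validation metric are replaced by a toy hidden "quality" score per training pair, and the selector is a bare REINFORCE policy over independent Bernoulli selections.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy weakly labeled pool: each pair has a hidden quality score.
# In the paper's setting, the reward would come from a downstream
# validation metric of the trained generator; here we use the mean
# quality of the selected subset as a stand-in.
quality = rng.uniform(0, 1, size=50)   # hidden ground-truth quality
logits = np.zeros(50)                  # selector parameters

baseline = 0.0                         # running mean of rewards
for step in range(500):
    probs = 1.0 / (1.0 + np.exp(-logits))      # selection probabilities
    select = rng.uniform(size=50) < probs      # Bernoulli selection mask
    if not select.any():
        continue
    reward = quality[select].mean()            # proxy validation reward
    # Reward normalization: subtract a running baseline so the
    # policy gradient is centered, reducing its variance.
    baseline = 0.9 * baseline + 0.1 * reward
    advantage = reward - baseline
    # REINFORCE gradient for independent Bernoulli selections.
    grad = (select.astype(float) - probs) * advantage
    logits += 0.5 * grad

# After training, the selector should favor higher-quality pairs.
top = np.argsort(-logits)[:10]
```

The baseline subtraction is the simplest normalization scheme; the paper's variant may differ, but the purpose is the same: raw rewards drift as the generator improves, so centering them keeps the selector's gradient signal meaningful across training.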
Anthology ID:
2022.naacl-main.99
Volume:
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Month:
July
Year:
2022
Address:
Seattle, United States
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
1385–1395
URL:
https://aclanthology.org/2022.naacl-main.99
DOI:
10.18653/v1/2022.naacl-main.99
Cite (ACL):
Haiyan Yin, Dingcheng Li, and Ping Li. 2022. Learning to Selectively Learn for Weakly Supervised Paraphrase Generation with Model-based Reinforcement Learning. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 1385–1395, Seattle, United States. Association for Computational Linguistics.
Cite (Informal):
Learning to Selectively Learn for Weakly Supervised Paraphrase Generation with Model-based Reinforcement Learning (Yin et al., NAACL 2022)
PDF:
https://aclanthology.org/2022.naacl-main.99.pdf
Data
COCO