Generating Labeled Data for Relation Extraction: A Meta Learning Approach with Joint GPT-2 Training

Amir Pouran Ben Veyseh; Franck Dernoncourt; Bonan Min; Thien Nguyen

doi:10.18653/v1/2023.findings-acl.727

Abstract

Relation Extraction (RE) is the task of identifying semantic relations between real-world entities mentioned in text. Despite significant progress in RE research, a remaining challenge is the lack of training data for data-hungry deep learning models. The cost of annotation and the difficulty of the task are among the hindrances to collecting large-scale RE datasets in different domains. To address this limitation, we propose a novel framework to automatically generate labeled data for RE. Our framework employs the pre-trained language model GPT-2 for data generation. In addition, to optimize the generated samples for an RE model, we introduce a meta learning approach that allows the GPT-2 model to be updated during the training process for RE. In particular, to leverage feedback from the RE model to improve data generation from GPT-2, we propose a novel reward function to update the GPT-2 model with REINFORCE, seeking to promote the similarity between the gradients of the RE loss function computed for generated data and for a meta development set. We conduct extensive experiments on two benchmark datasets and achieve state-of-the-art performance for RE.
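The gradient-similarity reward described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not state which similarity measure is used, so cosine similarity is an assumption here, a two-weight logistic regression stands in for the RE model, and all function names (`logistic_grad`, `gradient_similarity_reward`, `reinforce_loss`) are hypothetical.

```python
import math

def logistic_grad(weights, features, label):
    """Gradient of binary cross-entropy w.r.t. the weights for one example."""
    z = sum(w * x for w, x in zip(weights, features))
    p = 1.0 / (1.0 + math.exp(-z))
    return [(p - label) * x for x in features]

def mean_grad(weights, examples):
    """Average per-example gradient over a set of (features, label) pairs."""
    grads = [logistic_grad(weights, f, y) for f, y in examples]
    return [sum(g[i] for g in grads) / len(grads) for i in range(len(weights))]

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return dot / (nu * nv)

def gradient_similarity_reward(weights, generated_example, meta_dev_examples):
    """Reward a generated sample by how well its loss gradient aligns with
    the meta development set gradient (cosine is an assumed choice)."""
    g_gen = logistic_grad(weights, *generated_example)
    g_dev = mean_grad(weights, meta_dev_examples)
    return cosine(g_gen, g_dev)

def reinforce_loss(reward, log_prob):
    """REINFORCE surrogate: minimizing -reward * log_prob raises the
    generator's probability of samples that received a positive reward."""
    return -reward * log_prob

# Toy stand-in for the RE model and data (all values are illustrative).
weights = [0.0, 0.0]
meta_dev = [([1.0, 0.0], 1), ([0.0, 1.0], 0)]
helpful = ([1.0, 0.0], 1)   # label agrees with the meta development set
harmful = ([1.0, 0.0], 0)   # same features, conflicting label

r_helpful = gradient_similarity_reward(weights, helpful, meta_dev)
r_harmful = gradient_similarity_reward(weights, harmful, meta_dev)
```

In this toy setting the sample whose label agrees with the meta development set receives a positive reward and the conflicting one a negative reward, so a REINFORCE update would push the generator toward the former. In the framework the abstract describes, the generator is GPT-2 and the reward feeds its update during RE training.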

Anthology ID: 2023.findings-acl.727
Volume: Findings of the Association for Computational Linguistics: ACL 2023
Month: July
Year: 2023
Address: Toronto, Canada
Editors: Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 11466–11478
URL: https://aclanthology.org/2023.findings-acl.727
DOI: 10.18653/v1/2023.findings-acl.727
Cite (ACL): Amir Pouran Ben Veyseh, Franck Dernoncourt, Bonan Min, and Thien Nguyen. 2023. Generating Labeled Data for Relation Extraction: A Meta Learning Approach with Joint GPT-2 Training. In Findings of the Association for Computational Linguistics: ACL 2023, pages 11466–11478, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal): Generating Labeled Data for Relation Extraction: A Meta Learning Approach with Joint GPT-2 Training (Pouran Ben Veyseh et al., Findings 2023)
PDF: https://aclanthology.org/2023.findings-acl.727.pdf
