Jointly Improving Language Understanding and Generation with Quality-Weighted Weak Supervision of Automatic Labeling

Ernie Chang; Vera Demberg; Alex Marin

doi:10.18653/v1/2021.eacl-main.69

Abstract

Neural natural language generation (NLG) and understanding (NLU) models are data-hungry and require massive amounts of annotated data to be competitive. Recent frameworks address this bottleneck with generative models that synthesize weak labels at scale, where a small number of training labels are expert-curated and the rest of the data is automatically annotated. We follow that approach by automatically constructing large-scale weakly labeled data with a fine-tuned GPT-2, and we employ a semi-supervised framework to jointly train the NLG and NLU models. The proposed framework adapts the parameter updates of the models according to the estimated label quality. On both the E2E and Weather benchmarks, we show that this weakly supervised training paradigm is effective in low-resource scenarios with as few as 10 data instances, and that it outperforms benchmark systems on both datasets when 100% of the training data is used.
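
The abstract does not say how label quality is estimated or exactly how it enters the parameter update. As a minimal sketch of the general idea only, assume each weakly labeled example already carries a quality score in [0, 1] and that the score scales that example's contribution to the training loss. The function name `quality_weighted_loss` and the normalization by total quality are illustrative assumptions, not the authors' formulation.

```python
from typing import Sequence


def quality_weighted_loss(losses: Sequence[float],
                          qualities: Sequence[float]) -> float:
    """Average per-example losses, weighting each by its label-quality score.

    Hypothetical helper: `qualities` are assumed to lie in [0, 1], where 1
    means a fully trusted (e.g. expert-curated) label and 0 means the example
    is ignored. This is an illustration, not the paper's method.
    """
    if len(losses) != len(qualities):
        raise ValueError("losses and qualities must have the same length")
    if any(q < 0.0 or q > 1.0 for q in qualities):
        raise ValueError("quality scores must lie in [0, 1]")

    total_quality = sum(qualities)
    if total_quality == 0.0:
        # No trusted examples in the batch: contribute nothing to the update.
        return 0.0

    weighted_sum = sum(q * l for q, l in zip(qualities, losses))
    return weighted_sum / total_quality
```

Under this assumption, a low-quality weak label pulls the model less than an expert-curated one, so noisy automatic annotations still contribute signal without dominating the update.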

Anthology ID: 2021.eacl-main.69
Volume: Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume
Month: April
Year: 2021
Address: Online
Editors: Paola Merlo, Jörg Tiedemann, Reut Tsarfaty
Venue: EACL
Publisher: Association for Computational Linguistics
Pages: 818–829
URL: https://aclanthology.org/2021.eacl-main.69
DOI: 10.18653/v1/2021.eacl-main.69
Cite (ACL): Ernie Chang, Vera Demberg, and Alex Marin. 2021. Jointly Improving Language Understanding and Generation with Quality-Weighted Weak Supervision of Automatic Labeling. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, pages 818–829, Online. Association for Computational Linguistics.
Cite (Informal): Jointly Improving Language Understanding and Generation with Quality-Weighted Weak Supervision of Automatic Labeling (Chang et al., EACL 2021)
PDF: https://aclanthology.org/2021.eacl-main.69.pdf