Prompt-Based Rule Discovery and Boosting for Interactive Weakly-Supervised Learning

Rongzhi Zhang, Yue Yu, Pranav Shetty, Le Song, Chao Zhang


Abstract
Weakly-supervised learning (WSL) has shown promising results in addressing label scarcity on many NLP tasks, but manually designing a comprehensive, high-quality labeling rule set is tedious and difficult. We study interactive weakly-supervised learning: the problem of iteratively and automatically discovering novel labeling rules from data to improve the WSL model. Our proposed model, named PRBoost, achieves this goal via iterative prompt-based rule discovery and model boosting. It uses boosting to identify large-error instances and discovers candidate rules from them by prompting pre-trained LMs with rule templates. The candidate rules are judged by human experts, and the accepted rules are used to generate complementary weak labels and strengthen the current model. Experiments on four tasks show that PRBoost outperforms state-of-the-art WSL baselines by up to 7.1% and narrows the gap to fully supervised models.
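The loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a standard AdaBoost-style reweighting to surface large-error instances, and the function names, toy data, and template string are all hypothetical. The pre-trained LM call and the human review step are left out.

```python
import math

# Hypothetical rule template; "[MASK]" marks the slot a pre-trained LM would fill.
TEMPLATE = "{text} This is about [MASK]."

def boosting_round(weights, y_true, y_pred):
    """One AdaBoost-style round (an assumption, not the paper's exact update).

    Returns the model weight alpha and the renormalized instance weights.
    """
    miss = [int(t != p) for t, p in zip(y_true, y_pred)]
    err = sum(w * m for w, m in zip(weights, miss)) / sum(weights)
    err = min(max(err, 1e-10), 1 - 1e-10)  # guard against log(0)
    alpha = 0.5 * math.log((1 - err) / err)
    # Misclassified instances gain weight, correct ones lose weight.
    new_w = [w * math.exp(alpha if m else -alpha) for w, m in zip(weights, miss)]
    norm = sum(new_w)
    return alpha, [w / norm for w in new_w]

def large_error_instances(weights, k):
    """Indices of the k instances with the largest boosting weights."""
    return sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)[:k]

def fill_template(template, text):
    """Build the prompt that would be sent to a masked LM to propose a rule."""
    return template.format(text=text)

# Toy data (made up for illustration): the current model misses instance 2.
texts = ["Stocks fell sharply.", "The team won the final.",
         "The striker signed a new deal.", "Oil prices rose."]
y_true = [0, 1, 1, 0]
y_pred = [0, 1, 0, 0]
weights = [0.25] * 4

alpha, weights = boosting_round(weights, y_true, y_pred)
top = large_error_instances(weights, k=1)
prompts = [fill_template(TEMPLATE, texts[i]) for i in top]
```

In a full pipeline, the LM's fill for each prompt would become a candidate rule, a human expert would accept or reject it, and the accepted rules would supply weak labels for training the next model in the ensemble.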
Anthology ID:
2022.acl-long.55
Volume:
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
May
Year:
2022
Address:
Dublin, Ireland
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
745–758
URL:
https://aclanthology.org/2022.acl-long.55
DOI:
10.18653/v1/2022.acl-long.55
Cite (ACL):
Rongzhi Zhang, Yue Yu, Pranav Shetty, Le Song, and Chao Zhang. 2022. Prompt-Based Rule Discovery and Boosting for Interactive Weakly-Supervised Learning. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 745–758, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
Prompt-Based Rule Discovery and Boosting for Interactive Weakly-Supervised Learning (Zhang et al., ACL 2022)
PDF:
https://aclanthology.org/2022.acl-long.55.pdf
Code:
rz-zhang/prboost
Data:
AG News