WALNUT: A Benchmark on Semi-weakly Supervised Learning for Natural Language Understanding

Guoqing Zheng, Giannis Karamanolakis, Kai Shu, Ahmed Awadallah


Abstract
Building machine learning models for natural language understanding (NLU) tasks relies heavily on labeled data. Weak supervision has proven valuable when large amounts of labeled data are unavailable or expensive to obtain. Existing work on weak supervision for NLU mostly either focuses on a specific task or simulates weak supervision signals from ground-truth labels. Without a unified and systematic benchmark that offers diverse tasks and real-world weak labeling rules, it is hard to compare different approaches and evaluate the benefit of weak supervision. In this paper, we propose such a benchmark, named WALNUT, to advocate and facilitate research on weak supervision for NLU. WALNUT consists of NLU tasks of different types, including document-level and token-level prediction tasks. WALNUT is the first semi-weakly supervised learning benchmark for NLU, where each task contains weak labels generated by multiple real-world weak sources, together with a small set of clean labels. We conduct baseline evaluations on WALNUT to systematically evaluate the effectiveness of various weak supervision methods and model architectures. Our results demonstrate the benefit of weak supervision for low-resource NLU tasks and highlight interesting patterns across tasks. We expect WALNUT to stimulate further research on methodologies to leverage weak supervision more effectively. The benchmark and code for baselines are available at aka.ms/walnut_benchmark.
Anthology ID:
2022.naacl-main.64
Volume:
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Month:
July
Year:
2022
Address:
Seattle, United States
Editors:
Marine Carpuat, Marie-Catherine de Marneffe, Ivan Vladimir Meza Ruiz
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
873–899
URL:
https://aclanthology.org/2022.naacl-main.64
DOI:
10.18653/v1/2022.naacl-main.64
Cite (ACL):
Guoqing Zheng, Giannis Karamanolakis, Kai Shu, and Ahmed Awadallah. 2022. WALNUT: A Benchmark on Semi-weakly Supervised Learning for Natural Language Understanding. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 873–899, Seattle, United States. Association for Computational Linguistics.
Cite (Informal):
WALNUT: A Benchmark on Semi-weakly Supervised Learning for Natural Language Understanding (Zheng et al., NAACL 2022)
PDF:
https://aclanthology.org/2022.naacl-main.64.pdf
Video:
https://aclanthology.org/2022.naacl-main.64.mp4
Data:
AG News, GLUE, IMDb Movie Reviews, SuperGLUE