A Universal Discriminator for Zero-Shot Generalization

Haike Xu, Zongyu Lin, Jing Zhou, Yanan Zheng, Zhilin Yang


Abstract
Generative modeling has been the dominant approach for large-scale pretraining and zero-shot generalization. In this work, we challenge this convention by showing that discriminative approaches perform substantially better than generative ones on a large number of NLP tasks. Technically, we train a single discriminator to predict whether a text sample comes from the true data distribution, similar to GANs. Since many NLP tasks can be formulated as selecting from a few options, we use this discriminator to predict the concatenation of input and which option has the highest probability of coming from the true data distribution. This simple formulation achieves state-of-the-art zero-shot results on the T0 benchmark, outperforming T0 by 16.0%, 7.8%, and 11.5% respectively on different scales. In the finetuning setting, our approach also achieves new state-of-the-art results on a wide range of NLP tasks, with only 1/4 parameters of previous methods. Meanwhile, our approach requires minimal prompting efforts, which largely improves robustness and is essential for real-world applications. Furthermore, we also jointly train a generalized UD in combination with generative tasks, which maintains its advantage on discriminative tasks and simultaneously works on generative tasks.
Anthology ID:
2023.acl-long.589
Volume:
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue:
ACL
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
10559–10575
Language:
URL:
https://aclanthology.org/2023.acl-long.589
DOI:
10.18653/v1/2023.acl-long.589
Bibkey:
Cite (ACL):
Haike Xu, Zongyu Lin, Jing Zhou, Yanan Zheng, and Zhilin Yang. 2023. A Universal Discriminator for Zero-Shot Generalization. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 10559–10575, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
A Universal Discriminator for Zero-Shot Generalization (Xu et al., ACL 2023)
Copy Citation:
PDF:
https://aclanthology.org/2023.acl-long.589.pdf
Video:
 https://aclanthology.org/2023.acl-long.589.mp4