Investigating Selective Prediction Approaches Across Several Tasks in IID, OOD, and Adversarial Settings

Neeraj Varshney, Swaroop Mishra, Chitta Baral


Abstract
Several task-specific approaches have been proposed to equip NLP systems with a ‘selective prediction’ capability. However, it remains unexplored which approaches work best across tasks, or even whether they consistently outperform the simplest baseline, MaxProb. To this end, we systematically study selective prediction in a large-scale setup of 17 datasets across several NLP tasks. Through comprehensive experiments under in-domain (IID), out-of-domain (OOD), and adversarial (ADV) settings, we show that despite leveraging additional resources (held-out data/computation), none of the existing approaches consistently and considerably outperforms MaxProb in all three settings. Furthermore, their performance does not translate well across tasks. For instance, Monte-Carlo Dropout outperforms all other approaches on Duplicate Detection datasets but does not fare well on NLI datasets, especially in the OOD setting. Thus, we recommend that future selective prediction approaches be evaluated across tasks and settings so that their capabilities can be estimated reliably.
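
As a rough illustration of the MaxProb baseline named in the abstract, the sketch below assumes the common formulation in which the maximum softmax probability is the confidence score and the system abstains when that score falls below a threshold. The function names, the threshold, and the toy inputs are hypothetical and are not taken from the paper's released software.

```python
import math


def softmax(logits):
    """Convert raw logits into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def maxprob_predict(logits, threshold):
    """Return (label, confidence); label is None when the model abstains.

    Assumption: confidence is the maximum softmax probability, and the
    system abstains whenever that confidence is below `threshold`.
    """
    probs = softmax(logits)
    confidence = max(probs)
    label = probs.index(confidence)
    if confidence < threshold:
        return None, confidence
    return label, confidence


def coverage_and_risk(predictions, gold):
    """Coverage = fraction of inputs answered; risk = error rate on those."""
    answered = [(p, g) for p, g in zip(predictions, gold) if p is not None]
    coverage = len(answered) / len(gold)
    if not answered:
        return coverage, 0.0
    errors = sum(1 for p, g in answered if p != g)
    return coverage, errors / len(answered)


if __name__ == "__main__":
    # Hypothetical logits for a 3-class task such as NLI.
    batch = [[2.0, 0.0, 0.0], [0.1, 0.0, 0.0], [0.0, 3.0, 0.5]]
    gold = [0, 2, 1]
    preds = [maxprob_predict(logits, threshold=0.5)[0] for logits in batch]
    print(preds, coverage_and_risk(preds, gold))
```

Sweeping the threshold from 0 to 1 trades coverage against risk, which is the kind of trade-off a selective prediction evaluation examines; the 0.5 used here is arbitrary.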
Anthology ID:
2022.findings-acl.158
Volume:
Findings of the Association for Computational Linguistics: ACL 2022
Month:
May
Year:
2022
Address:
Dublin, Ireland
Editors:
Smaranda Muresan, Preslav Nakov, Aline Villavicencio
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
1995–2002
URL:
https://aclanthology.org/2022.findings-acl.158
DOI:
10.18653/v1/2022.findings-acl.158
Cite (ACL):
Neeraj Varshney, Swaroop Mishra, and Chitta Baral. 2022. Investigating Selective Prediction Approaches Across Several Tasks in IID, OOD, and Adversarial Settings. In Findings of the Association for Computational Linguistics: ACL 2022, pages 1995–2002, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
Investigating Selective Prediction Approaches Across Several Tasks in IID, OOD, and Adversarial Settings (Varshney et al., Findings 2022)
PDF:
https://aclanthology.org/2022.findings-acl.158.pdf
Software:
 2022.findings-acl.158.software.zip
Video:
 https://aclanthology.org/2022.findings-acl.158.mp4
Data:
GLUE, MultiNLI, SNLI