On the data requirements of probing

Zining Zhu, Jixuan Wang, Bai Li, Frank Rudzicz


Abstract
As large and powerful neural language models are developed, researchers have become increasingly interested in developing diagnostic tools to probe them. Many papers draw conclusions of the form “observation X is found in model Y”, each using its own dataset of a different size. Larger probing datasets yield more reliable results but are also more expensive to collect. There is not yet a quantitative method for estimating reasonable probing dataset sizes. We address this gap in the context of comparing two probing configurations: after collecting a small dataset in a pilot study, how many additional data samples are sufficient to distinguish the two configurations? We present a novel method to estimate the required number of data samples in such experiments and, across several case studies, verify that our estimates have sufficient statistical power. Our framework helps to systematically construct probing datasets to diagnose neural NLP models.
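To make the question in the abstract concrete, the sketch below shows a textbook normal-approximation power analysis for comparing two configurations from pilot scores. It is not the method proposed in the paper, whose details are not given on this page. The function name `required_samples_per_config`, the pilot score lists, and the defaults `alpha=0.05` and `power=0.8` are all assumptions chosen for illustration.

```python
import math
from statistics import NormalDist, mean, stdev


def required_samples_per_config(pilot_a, pilot_b, alpha=0.05, power=0.8):
    """Approximate per-configuration sample size for a two-sided two-sample z-test.

    Hypothetical helper: a generic power analysis, not the paper's estimator.
    pilot_a and pilot_b are per-sample scores (e.g. losses) from a pilot study.
    """
    # Effect size estimated from the pilot data.
    delta = abs(mean(pilot_a) - mean(pilot_b))
    if delta == 0:
        raise ValueError("pilot means are identical; no effect size to estimate")

    # Average of the two sample variances (equal-size pooled estimate).
    variance = (stdev(pilot_a) ** 2 + stdev(pilot_b) ** 2) / 2

    # Standard normal quantiles for the significance level and target power.
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)

    # Classic formula: n = 2 * sigma^2 * (z_alpha + z_beta)^2 / delta^2.
    n = 2 * variance * (z_alpha + z_beta) ** 2 / delta ** 2
    return math.ceil(n)


# Made-up pilot scores for two probing configurations (illustrative only).
pilot_a = [0.80, 0.82, 0.78, 0.81, 0.79]
pilot_b = [0.76, 0.78, 0.74, 0.77, 0.75]
n_needed = required_samples_per_config(pilot_a, pilot_b)
print(f"samples needed per configuration: {n_needed}")
```

Under these assumptions, the number of additional samples to collect is the estimate minus the pilot size, floored at zero. Small pilots give noisy variance and effect-size estimates, so the result is best read as a rough planning figure.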
Anthology ID:
2022.findings-acl.326
Volume:
Findings of the Association for Computational Linguistics: ACL 2022
Month:
May
Year:
2022
Address:
Dublin, Ireland
Editors:
Smaranda Muresan, Preslav Nakov, Aline Villavicencio
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
4132–4147
URL:
https://aclanthology.org/2022.findings-acl.326
DOI:
10.18653/v1/2022.findings-acl.326
Cite (ACL):
Zining Zhu, Jixuan Wang, Bai Li, and Frank Rudzicz. 2022. On the data requirements of probing. In Findings of the Association for Computational Linguistics: ACL 2022, pages 4132–4147, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
On the data requirements of probing (Zhu et al., Findings 2022)
PDF:
https://aclanthology.org/2022.findings-acl.326.pdf
Software:
2022.findings-acl.326.software.zip
Video:
https://aclanthology.org/2022.findings-acl.326.mp4
Code:
spoclab-ca/probing_dataset
Data:
SentEval