Inoculation by Fine-Tuning: A Method for Analyzing Challenge Datasets

Nelson F. Liu, Roy Schwartz, Noah A. Smith


Abstract
Several datasets have recently been constructed to expose brittleness in models trained on existing benchmarks. While model performance on these challenge datasets is significantly lower compared to the original benchmark, it is unclear what particular weaknesses they reveal. For example, a challenge dataset may be difficult because it targets phenomena that current models cannot capture, or because it simply exploits blind spots in a model’s specific training set. We introduce inoculation by fine-tuning, a new analysis method for studying challenge datasets by exposing models (the metaphorical patient) to a small amount of data from the challenge dataset (a metaphorical pathogen) and assessing how well they can adapt. We apply our method to analyze the NLI “stress tests” (Naik et al., 2018) and the Adversarial SQuAD dataset (Jia and Liang, 2017). We show that after slight exposure, some of these datasets are no longer challenging, while others remain difficult. Our results indicate that failures on challenge datasets may lead to very different conclusions about models, training datasets, and the challenge datasets themselves.
Anthology ID:
N19-1225
Volume:
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)
Month:
June
Year:
2019
Address:
Minneapolis, Minnesota
Editors:
Jill Burstein, Christy Doran, Thamar Solorio
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
2171–2179
URL:
https://aclanthology.org/N19-1225
DOI:
10.18653/v1/N19-1225
Cite (ACL):
Nelson F. Liu, Roy Schwartz, and Noah A. Smith. 2019. Inoculation by Fine-Tuning: A Method for Analyzing Challenge Datasets. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 2171–2179, Minneapolis, Minnesota. Association for Computational Linguistics.
Cite (Informal):
Inoculation by Fine-Tuning: A Method for Analyzing Challenge Datasets (Liu et al., NAACL 2019)
PDF:
https://aclanthology.org/N19-1225.pdf
Video:
https://aclanthology.org/N19-1225.mp4
Data
MultiNLI, SQuAD