%0 Conference Proceedings
%T More Bang for Your Buck: Natural Perturbation for Robust Question Answering
%A Khashabi, Daniel
%A Khot, Tushar
%A Sabharwal, Ashish
%Y Webber, Bonnie
%Y Cohn, Trevor
%Y He, Yulan
%Y Liu, Yang
%S Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
%D 2020
%8 November
%I Association for Computational Linguistics
%C Online
%F khashabi-etal-2020-bang
%X Deep learning models for linguistic tasks require large training datasets, which are expensive to create. As an alternative to the traditional approach of creating new instances by repeating the process of creating one instance, we propose doing so by first collecting a set of seed examples and then applying human-driven natural perturbations (as opposed to rule-based machine perturbations), which often change the gold label as well. Such perturbations have the advantage of being relatively easier (and hence cheaper) to create than writing out completely new examples. Further, they help address the issue that even models achieving human-level scores on NLP datasets are known to be considerably sensitive to small changes in input. To evaluate the idea, we consider a recent question-answering dataset (BOOLQ) and study our approach as a function of the perturbation cost ratio, the relative cost of perturbing an existing question vs. creating a new one from scratch. We find that when natural perturbations are moderately cheaper to create (cost ratio under 60%), it is more effective to use them for training BOOLQ models: such models exhibit 9% higher robustness and 4.5% stronger generalization, while retaining performance on the original BOOLQ dataset.
%R 10.18653/v1/2020.emnlp-main.12
%U https://aclanthology.org/2020.emnlp-main.12
%U https://doi.org/10.18653/v1/2020.emnlp-main.12
%P 163-170