@inproceedings{van-der-goot-2021-need,
title = "We Need to Talk About train-dev-test Splits",
author = "van der Goot, Rob",
editor = "Moens, Marie-Francine and
Huang, Xuanjing and
Specia, Lucia and
Yih, Scott Wen-tau",
booktitle = "Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing",
month = nov,
year = "2021",
address = "Online and Punta Cana, Dominican Republic",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2021.emnlp-main.368/",
doi = "10.18653/v1/2021.emnlp-main.368",
pages = "4485--4494",
abstract = "Standard train-dev-test splits used to benchmark multiple models against each other are ubiquitously used in Natural Language Processing (NLP). In this setup, the train data is used for training the model, the development set for evaluating different versions of the proposed model(s) during development, and the test set to confirm the answers to the main research question(s). However, the introduction of neural networks in NLP has led to a different use of these standard splits; the development set is now often used for model selection during the training procedure. Because of this, comparing multiple versions of the same model during development leads to overestimation on the development data. As an effect, people have started to compare an increasing amount of models on the test data, leading to faster overfitting and {\textquotedblleft}expiration{\textquotedblright} of our test sets. We propose to use a tune-set when developing neural network methods, which can be used for model picking so that comparing the different versions of a new model can safely be done on the development data."
}

<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
  <mods ID="van-der-goot-2021-need">
    <titleInfo>
      <title>We Need to Talk About train-dev-test Splits</title>
    </titleInfo>
    <name type="personal">
      <namePart type="given">Rob</namePart>
      <namePart type="family">van der Goot</namePart>
      <role>
        <roleTerm authority="marcrelator" type="text">author</roleTerm>
      </role>
    </name>
    <originInfo>
      <dateIssued>2021-11</dateIssued>
    </originInfo>
    <typeOfResource>text</typeOfResource>
    <relatedItem type="host">
      <titleInfo>
        <title>Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing</title>
      </titleInfo>
      <name type="personal">
        <namePart type="given">Marie-Francine</namePart>
        <namePart type="family">Moens</namePart>
        <role>
          <roleTerm authority="marcrelator" type="text">editor</roleTerm>
        </role>
      </name>
      <name type="personal">
        <namePart type="given">Xuanjing</namePart>
        <namePart type="family">Huang</namePart>
        <role>
          <roleTerm authority="marcrelator" type="text">editor</roleTerm>
        </role>
      </name>
      <name type="personal">
        <namePart type="given">Lucia</namePart>
        <namePart type="family">Specia</namePart>
        <role>
          <roleTerm authority="marcrelator" type="text">editor</roleTerm>
        </role>
      </name>
      <name type="personal">
        <namePart type="given">Scott</namePart>
        <namePart type="given">Wen-tau</namePart>
        <namePart type="family">Yih</namePart>
        <role>
          <roleTerm authority="marcrelator" type="text">editor</roleTerm>
        </role>
      </name>
      <originInfo>
        <publisher>Association for Computational Linguistics</publisher>
        <place>
          <placeTerm type="text">Online and Punta Cana, Dominican Republic</placeTerm>
        </place>
      </originInfo>
      <genre authority="marcgt">conference publication</genre>
    </relatedItem>
    <abstract>Standard train-dev-test splits used to benchmark multiple models against each other are ubiquitous in Natural Language Processing (NLP). In this setup, the train data is used for training the model, the development set for evaluating different versions of the proposed model(s) during development, and the test set to confirm the answers to the main research question(s). However, the introduction of neural networks in NLP has led to a different use of these standard splits; the development set is now often used for model selection during the training procedure. Because of this, comparing multiple versions of the same model during development leads to overestimation on the development data. As a result, people have started to compare an increasing number of models on the test data, leading to faster overfitting and “expiration” of our test sets. We propose to use a tune-set when developing neural network methods, which can be used for model picking so that comparing the different versions of a new model can safely be done on the development data.</abstract>
<identifier type="citekey">van-der-goot-2021-need</identifier>
<identifier type="doi">10.18653/v1/2021.emnlp-main.368</identifier>
<location>
<url>https://aclanthology.org/2021.emnlp-main.368/</url>
</location>
<part>
<date>2021-11</date>
<extent unit="page">
<start>4485</start>
<end>4494</end>
</extent>
</part>
</mods>
</modsCollection>

%0 Conference Proceedings
%T We Need to Talk About train-dev-test Splits
%A van der Goot, Rob
%Y Moens, Marie-Francine
%Y Huang, Xuanjing
%Y Specia, Lucia
%Y Yih, Scott Wen-tau
%S Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
%D 2021
%8 November
%I Association for Computational Linguistics
%C Online and Punta Cana, Dominican Republic
%F van-der-goot-2021-need
%X Standard train-dev-test splits used to benchmark multiple models against each other are ubiquitous in Natural Language Processing (NLP). In this setup, the train data is used for training the model, the development set for evaluating different versions of the proposed model(s) during development, and the test set to confirm the answers to the main research question(s). However, the introduction of neural networks in NLP has led to a different use of these standard splits; the development set is now often used for model selection during the training procedure. Because of this, comparing multiple versions of the same model during development leads to overestimation on the development data. As a result, people have started to compare an increasing number of models on the test data, leading to faster overfitting and “expiration” of our test sets. We propose to use a tune-set when developing neural network methods, which can be used for model picking so that comparing the different versions of a new model can safely be done on the development data.
%R 10.18653/v1/2021.emnlp-main.368
%U https://aclanthology.org/2021.emnlp-main.368/
%U https://doi.org/10.18653/v1/2021.emnlp-main.368
%P 4485-4494

Markdown (Informal)
[We Need to Talk About train-dev-test Splits](https://aclanthology.org/2021.emnlp-main.368/) (van der Goot, EMNLP 2021)

ACL
Rob van der Goot. 2021. We Need to Talk About train-dev-test Splits. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 4485–4494, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.
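
As a quick illustration of the protocol described in the abstract, here is a minimal Python sketch of the four-way train/tune/dev/test split. The helper name `four_way_split`, the split ratios, and the toy usage comments are illustrative assumptions, not code or settings from the paper; the paper's point is only that model picking (e.g. checkpoint selection during training) should consume a dedicated tune set, so that the development set remains unbiased for comparing model variants and the test set is read only once.

```python
# A minimal sketch (our illustration, not code from the paper) of the
# train/tune/dev/test protocol: the tune set absorbs model picking during
# training, the dev set is reserved for comparing model variants, and the
# test set is consulted once, for the final result.
import random

def four_way_split(examples, seed=42, fractions=(0.7, 0.1, 0.1)):
    """Shuffle and split into train, tune, dev, and test portions.

    `fractions` gives the train/tune/dev shares; the remainder is test.
    """
    rng = random.Random(seed)
    shuffled = list(examples)
    rng.shuffle(shuffled)
    n = len(shuffled)
    a = int(fractions[0] * n)          # end of train
    b = a + int(fractions[1] * n)      # end of tune
    c = b + int(fractions[2] * n)      # end of dev
    return shuffled[:a], shuffled[a:b], shuffled[b:c], shuffled[c:]

train, tune, dev, test = four_way_split(range(1000))
assert len(train) == 700 and len(tune) == len(dev) == len(test) == 100

# Hypothetical downstream usage, sketched as comments:
#   best_epoch  = argmax_e accuracy(checkpoint[e], tune)   # model picking
#   best_variant = argmax_v accuracy(variant[v], dev)      # compare variants
#   final_score = accuracy(best_variant, test)             # report once
```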