Reproducibility in NLP: What Have We Learned from the Checklist?

Ian Magnusson, Noah A. Smith, Jesse Dodge


Abstract
Scientific progress in NLP rests on the reproducibility of researchers’ claims. The *CL conferences created the NLP Reproducibility Checklist in 2020 to be completed by authors at submission to remind them of key information to include. We provide the first analysis of the Checklist by examining 10,405 anonymous responses to it. First, we find evidence of an increase in reporting of information on efficiency, validation performance, summary statistics, and hyperparameters after the Checklist’s introduction. Further, we show that the acceptance rate grows for submissions with more Yes responses. We find that the 44% of submissions that gather new data are 5% less likely to be accepted than those that do not; the average reviewer-rated reproducibility of these submissions is also 2% lower relative to the rest. We find that only 46% of submissions claim to open-source their code, though submissions that do have an 8% higher reproducibility score relative to those that do not, the most for any item. We discuss what can be inferred about the state of reproducibility in NLP, and provide a set of recommendations for future conferences, including: a) allowing submission of code and appendices one week after the deadline, and b) measuring dataset reproducibility by a checklist of data collection practices.
Anthology ID:
2023.findings-acl.809
Volume:
Findings of the Association for Computational Linguistics: ACL 2023
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
12789–12811
URL:
https://aclanthology.org/2023.findings-acl.809
DOI:
10.18653/v1/2023.findings-acl.809
Cite (ACL):
Ian Magnusson, Noah A. Smith, and Jesse Dodge. 2023. Reproducibility in NLP: What Have We Learned from the Checklist? In Findings of the Association for Computational Linguistics: ACL 2023, pages 12789–12811, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
Reproducibility in NLP: What Have We Learned from the Checklist? (Magnusson et al., Findings 2023)
PDF:
https://aclanthology.org/2023.findings-acl.809.pdf
Video:
https://aclanthology.org/2023.findings-acl.809.mp4