Anya Belz, Craig Thomson, Ehud Reiter, and Simon Mille. 2023. Non-Repeatable Experiments and Non-Reproducible Results: The Reproducibility Crisis in Human Evaluation in NLP. In Findings of the Association for Computational Linguistics: ACL 2023, edited by Anna Rogers, Jordan Boyd-Graber, and Naoaki Okazaki, pages 3676-3687, Toronto, Canada, July 2023. Association for Computational Linguistics. Conference publication. Anthology ID: belz-etal-2023-non. DOI: 10.18653/v1/2023.findings-acl.226. URL: https://aclanthology.org/2023.findings-acl.226/