The 2023 ReproNLP Shared Task on Reproducibility of Evaluations in NLP: Overview and Results

Anya Belz, Craig Thomson


Abstract
This paper presents an overview of, and the results from, the 2023 Shared Task on Reproducibility of Evaluations in NLP (ReproNLP’23), following on from two previous shared tasks on reproducibility of evaluations in NLG, ReproGen’21 and ReproGen’22. This shared task series forms part of an ongoing research programme designed to develop theory and practice of reproducibility assessment in NLP and machine learning, all against a background of an interest in reproducibility that continues to grow in the two fields. This paper describes the ReproNLP’23 shared task, summarises results from the reproduction studies submitted, and provides comparative analysis of the results.
Anthology ID:
2023.humeval-1.4
Volume:
Proceedings of the 3rd Workshop on Human Evaluation of NLP Systems
Month:
September
Year:
2023
Address:
Varna, Bulgaria
Editors:
Anya Belz, Maja Popović, Ehud Reiter, Craig Thomson, João Sedoc
Venues:
HumEval | WS
Publisher:
INCOMA Ltd., Shoumen, Bulgaria
Pages:
35–48
URL:
https://aclanthology.org/2023.humeval-1.4
Cite (ACL):
Anya Belz and Craig Thomson. 2023. The 2023 ReproNLP Shared Task on Reproducibility of Evaluations in NLP: Overview and Results. In Proceedings of the 3rd Workshop on Human Evaluation of NLP Systems, pages 35–48, Varna, Bulgaria. INCOMA Ltd., Shoumen, Bulgaria.
Cite (Informal):
The 2023 ReproNLP Shared Task on Reproducibility of Evaluations in NLP: Overview and Results (Belz & Thomson, HumEval-WS 2023)
PDF:
https://aclanthology.org/2023.humeval-1.4.pdf