TUDA-Reproducibility @ ReproGen: Replicability of Human Evaluation of Text-to-Text and Concept-to-Text Generation

Christian Richter, Yanran Chen, Steffen Eger


Abstract
This paper describes our contribution to the ReproGen shared task (Belz et al., 2021), which investigates the reproducibility of human evaluations in Natural Language Generation. We selected the paper “Generation of Company descriptions using concept-to-text and text-to-text deep models: data set collection and systems evaluation” (Qader et al., 2018) and aimed to replicate, as closely as possible, its human evaluation and the subsequent comparison between the human judgements and the automatic evaluation metrics. We first outline the text generation task addressed by Qader et al. (2018). We then document how we approached the replication of their human evaluation, and we discuss the difficulties we encountered and the information that was missing. Our results show a medium to strong correlation (Spearman correlation of 0.66 overall) with the original results of Qader et al. (2018). However, because information is missing on how Qader et al. (2018) compared the human judgements with the metric scores, we refrained from reproducing this comparison.
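For readers unfamiliar with the statistic quoted in the abstract, the following is a minimal sketch of how a Spearman rank correlation between an original and a replicated set of human scores could be computed. It is not the authors' code: the function names `average_ranks` and `spearman` and the score lists are hypothetical and chosen only for illustration, and the numbers are not taken from either paper.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation, computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical mean scores per evaluated item (illustrative only).
original = [3.2, 4.1, 2.5, 3.8, 4.6]
replication = [3.0, 4.3, 3.6, 3.5, 4.4]

rho = spearman(original, replication)
print(f"Spearman correlation: {rho:.2f}")
```

Because the statistic depends only on the ordering of the scores, it measures whether the replication ranks the evaluated items similarly to the original study, not whether the absolute scores agree.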
Anthology ID:
2021.inlg-1.32
Volume:
Proceedings of the 14th International Conference on Natural Language Generation
Month:
August
Year:
2021
Address:
Aberdeen, Scotland, UK
Editors:
Anya Belz, Angela Fan, Ehud Reiter, Yaji Sripada
Venue:
INLG
SIG:
SIGGEN
Publisher:
Association for Computational Linguistics
Pages:
301–307
URL:
https://aclanthology.org/2021.inlg-1.32
DOI:
10.18653/v1/2021.inlg-1.32
Cite (ACL):
Christian Richter, Yanran Chen, and Steffen Eger. 2021. TUDA-Reproducibility @ ReproGen: Replicability of Human Evaluation of Text-to-Text and Concept-to-Text Generation. In Proceedings of the 14th International Conference on Natural Language Generation, pages 301–307, Aberdeen, Scotland, UK. Association for Computational Linguistics.
Cite (Informal):
TUDA-Reproducibility @ ReproGen: Replicability of Human Evaluation of Text-to-Text and Concept-to-Text Generation (Richter et al., INLG 2021)
PDF:
https://aclanthology.org/2021.inlg-1.32.pdf