Jacopo Amidei, Paul Piwek, and Alistair Willis. 2019. Agreement is overrated: A plea for correlation to assess human evaluation reliability. In Proceedings of the 12th International Conference on Natural Language Generation, edited by Kees van Deemter, Chenghua Lin, and Hiroya Takamura, pages 344–354, Tokyo, Japan, October–November 2019. Association for Computational Linguistics. Conference publication. Anthology ID: amidei-etal-2019-agreement. DOI: 10.18653/v1/W19-8642. URL: https://aclanthology.org/W19-8642/