Finding Replicable Human Evaluations via Stable Ranking Probability

Authors: Parker Riley, Daniel Deutsch, George Foster, Viresh Ratnakar, Ali Dabirmoghaddam, Markus Freitag
Published: June 2024
In: Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
Editors: Kevin Duh, Helena Gomez, Steven Bethard
Publisher: Association for Computational Linguistics
Location: Mexico City, Mexico
Type: conference publication
Pages: 4908-4919
Citation key: riley-etal-2024-finding
DOI: 10.18653/v1/2024.naacl-long.275
URL: https://aclanthology.org/2024.naacl-long.275/