Title: Decoding the Metrics Maze: Navigating the Landscape of Conversational Question Answering System Evaluation in Procedural Tasks
Authors: Alexander Frummet, David Elsweiler
Date: 2024-05
Type: text (conference publication)
In: Proceedings of the Fourth Workshop on Human Evaluation of NLP Systems (HumEval) @ LREC-COLING 2024
Editors: Simone Balloccu, Anya Belz, Rudali Huidrom, Ehud Reiter, Joao Sedoc, Craig Thomson
Publisher: ELRA and ICCL
Location: Torino, Italia
Pages: 81-90
Citation key: frummet-elsweiler-2024-decoding
URL: https://aclanthology.org/2024.humeval-1.8/