Reduce Human Labor On Evaluating Conversational Information Retrieval System: A Human-Machine Collaboration Approach

Chen Huang, Peixin Qin, Wenqiang Lei, Jiancheng Lv


Abstract
Evaluating conversational information retrieval (CIR) systems is challenging because it requires a significant amount of human labor for annotation, so more labor-efficient evaluation methods are needed. To address this challenge, we take the first step toward involving active testing in CIR evaluation and propose a novel method called HomCoE. It strategically selects a small amount of data for human annotation, then calibrates the evaluation results to eliminate evaluation biases. As a result, it evaluates the CIR system accurately at a low cost in human labor. Our experiments show that it consumes less than 1% of human labor and achieves a consistency rate of 95%-99% with human evaluation results, demonstrating the superiority of our method over other baselines.
Anthology ID:
2023.emnlp-main.670
Volume:
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
Month:
December
Year:
2023
Address:
Singapore
Editors:
Houda Bouamor, Juan Pino, Kalika Bali
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
10876–10891
URL:
https://aclanthology.org/2023.emnlp-main.670
DOI:
10.18653/v1/2023.emnlp-main.670
Cite (ACL):
Chen Huang, Peixin Qin, Wenqiang Lei, and Jiancheng Lv. 2023. Reduce Human Labor On Evaluating Conversational Information Retrieval System: A Human-Machine Collaboration Approach. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 10876–10891, Singapore. Association for Computational Linguistics.
Cite (Informal):
Reduce Human Labor On Evaluating Conversational Information Retrieval System: A Human-Machine Collaboration Approach (Huang et al., EMNLP 2023)
PDF:
https://aclanthology.org/2023.emnlp-main.670.pdf
Video:
https://aclanthology.org/2023.emnlp-main.670.mp4