Title: CHIE: Generative MRC Evaluation for in-context QA with Correctness, Helpfulness, Irrelevancy, and Extraneousness Aspects
Authors: Wannaphong Phatthiyaphaibun; Surapon Nonesung; Peerat Limkonchotiwat; Can Udomcharoenchaikit; Jitkapat Sawatphol; Ekapol Chuangsuwanich; Sarana Nutanong
Date: 2024-11
Type: text (conference publication)
Published in: Proceedings of the 2nd GenBench Workshop on Generalisation (Benchmarking) in NLP
Editors: Dieuwke Hupkes; Verna Dankers; Khuyagbaatar Batsuren; Amirhossein Kazemnejad; Christos Christodoulopoulos; Mario Giulianelli; Ryan Cotterell
Publisher: Association for Computational Linguistics
Location: Miami, Florida, USA
Pages: 154-164
Citation key: phatthiyaphaibun-etal-2024-chie
DOI: 10.18653/v1/2024.genbench-1.10
URL: https://aclanthology.org/2024.genbench-1.10/