GroUSE: A Benchmark to Evaluate Evaluators in Grounded Question Answering

Authors: Sacha Muller, Antonio Loison, Bilel Omrani, Gautier Viaud
Date: 2025-01
Published in: Proceedings of the 31st International Conference on Computational Linguistics
Editors: Owen Rambow, Leo Wanner, Marianna Apidianaki, Hend Al-Khalifa, Barbara Di Eugenio, Steven Schockaert
Publisher: Association for Computational Linguistics
Location: Abu Dhabi, UAE
Type: conference publication
Pages: 4510-4534
Identifier: muller-etal-2025-grouse
URL: https://aclanthology.org/2025.coling-main.304/