Ranking Over Scoring: Towards Reliable and Robust Automated Evaluation of LLM-Generated Medical Explanatory Arguments

Authors: Iker De la Iglesia, Iakes Goenaga, Johanna Ramirez-Romero, Jose Maria Villa-Gonzalez, Josu Goikoetxea, Ander Barrena
Date: 2025-01
Published in: Proceedings of the 31st International Conference on Computational Linguistics
Editors: Owen Rambow, Leo Wanner, Marianna Apidianaki, Hend Al-Khalifa, Barbara Di Eugenio, Steven Schockaert
Publisher: Association for Computational Linguistics
Location: Abu Dhabi, UAE
Type: conference publication
Pages: 9456-9471
Citation key: de-la-iglesia-etal-2025-ranking
URL: https://aclanthology.org/2025.coling-main.634/