RAIDEN Benchmark: Evaluating Role-playing Conversational Agents with Measurement-Driven Custom Dialogues

Authors: Bowen Wu, Kaili Sun, Ziwei Bai, Ying Li, Baoxun Wang
Published in: Proceedings of the 31st International Conference on Computational Linguistics
Editors: Owen Rambow, Leo Wanner, Marianna Apidianaki, Hend Al-Khalifa, Barbara Di Eugenio, Steven Schockaert
Publisher: Association for Computational Linguistics
Location: Abu Dhabi, UAE
Date: 2025-01
Pages: 11086-11106
Type: conference publication
Identifier: wu-etal-2025-raiden
URL: https://aclanthology.org/2025.coling-main.735/