Krisztian Balog
2026
ConvApparel: A Benchmark Dataset and Validation Framework for User Simulators in Conversational Recommenders
Ofer Meshi | Krisztian Balog | Sally Goldman | Avi Caciularu | Guy Tennenholtz | Jihwan Jeong | Amir Globerson | Craig Boutilier
Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)
The promise of *LLM-based user simulators* to improve conversational AI is hindered by a critical "realism gap," leading to systems that are optimized for simulated interactions, but may fail to perform well in the real world. We introduce *ConvApparel*, a new dataset of human-AI conversations designed to address this gap. Its unique dual-agent data collection protocol, using both "good" and "bad" recommenders, enables counterfactual validation by capturing a wide spectrum of user experiences, enriched with first-person annotations of user satisfaction. We propose a comprehensive validation framework that combines *statistical alignment*, a *human-likeness score*, and *counterfactual validation* to test for generalization. Our experiments reveal a significant realism gap across all simulators. However, the framework also shows that data-driven simulators outperform a prompted baseline, particularly in counterfactual validation where they adapt more realistically to unseen behaviors, suggesting they embody more robust, if imperfect, user models.
2019
Coached Conversational Preference Elicitation: A Case Study in Understanding Movie Preferences
Filip Radlinski | Krisztian Balog | Bill Byrne | Karthik Krishnamoorthi
Proceedings of the 20th Annual SIGdial Meeting on Discourse and Dialogue
Conversational recommendation has recently attracted significant attention. As systems must understand users’ preferences, training them has called for conversational corpora, typically derived from task-oriented conversations. We observe that such corpora often do not reflect how people naturally describe preferences. We present a new approach to obtaining user preferences in dialogue: Coached Conversational Preference Elicitation. It allows collection of natural yet structured conversational preferences. Studying the dialogues in one domain, we present a brief quantitative analysis of how people describe movie preferences at scale. Demonstrating the methodology, we release the CCPE-M dataset to the community with over 500 movie preference dialogues expressing over 10,000 preferences.
2009
A Generative Blog Post Retrieval Model that Uses Query Expansion based on External Collections
Wouter Weerkamp | Krisztian Balog | Maarten de Rijke
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP
2007
UVA: Language Modeling Techniques for Web People Search
Krisztian Balog | Leif Azzopardi | Maarten de Rijke
Proceedings of the Fourth International Workshop on Semantic Evaluations (SemEval-2007)