IP-Dialog: Evaluating Implicit Personalization in Dialogue Systems with Synthetic Data

Bo Peng; Zhiheng Wang; Heyang Gong; Chaochao Lu

doi:10.18653/v1/2025.findings-emnlp.923

Abstract

In modern dialogue systems, the ability to implicitly infer user backgrounds from conversations and leverage this information for personalized assistance is crucial. However, the scarcity of high-quality data remains a fundamental challenge to evaluating and improving this capability. Traditional dataset construction methods are labor-intensive, resource-demanding, and raise privacy concerns. To address these issues, we propose a novel approach for automatic synthetic data generation and introduce the **I**mplicit **P**ersonalized **Dialog**ue (**IP-Dialog**) benchmark along with a training dataset, covering 10 tasks and 12 user attribute types. Additionally, we develop a systematic evaluation framework with four metrics to assess both attribute awareness and reasoning capabilities. We further propose five causal graphs to elucidate models’ reasoning pathways during implicit personalization. Extensive experiments yield insightful observations and prove the reliability of our dataset.

Anthology ID: 2025.findings-emnlp.923
Original: 2025.findings-emnlp.923v1
Version 2: 2025.findings-emnlp.923v2
Volume: Findings of the Association for Computational Linguistics: EMNLP 2025
Month: November
Year: 2025
Address: Suzhou, China
Editors: Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 17007–17040
URL: https://aclanthology.org/2025.findings-emnlp.923/
DOI: 10.18653/v1/2025.findings-emnlp.923
Cite (ACL): Bo Peng, Zhiheng Wang, Heyang Gong, and Chaochao Lu. 2025. IP-Dialog: Evaluating Implicit Personalization in Dialogue Systems with Synthetic Data. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 17007–17040, Suzhou, China. Association for Computational Linguistics.
Cite (Informal): IP-Dialog: Evaluating Implicit Personalization in Dialogue Systems with Synthetic Data (Peng et al., Findings 2025)
PDF: https://aclanthology.org/2025.findings-emnlp.923.pdf
Checklist: 2025.findings-emnlp.923.checklist.pdf
