PersonalityDBench: A Dataset for Personality Disorders - from Modeling to Controlled Generation

Federico Ravenda; Seyed Ali Bahrainian; Daniele Montagnani; Antonietta Mira; Andrea Raballo

Abstract

Personality disorders (PDs) are a complex class of mental health (MH) conditions characterized by persistent patterns of cognition, behavior, and emotional regulation that deviate from cultural norms. While social media has become a valuable resource for MH research, NLP has largely focused on more prevalent conditions (e.g., depression), leaving PDs underexplored. In this work, we introduce PersonalityDBench, a large-scale, clinically grounded dataset that supports both the multidimensional study of personality pathology and the standardized, reproducible evaluation of LLM steering toward clinically grounded behavioral targets. The dataset comprises two parts: (1) PRISMA and (2) PersonaDSteering. (1) PRISMA (PeRsonality dISorder MAnifestations) is a clinically annotated collection of social media content spanning the full spectrum of PDs. It links clinically validated diagnostic criteria and dimensional trait frameworks with computational annotation and analysis methods to support fine-grained, multidimensional study of how PDs manifest in naturalistic, free-form language. Building on PRISMA, (2) PersonaDSteering is a benchmark for LLM steering evaluation that operationalizes clinically grounded PD profiles into structured behavioral elicitation tasks, enabling multidimensional steerability assessment beyond single-behavior settings and supporting PD-consistent persona construction for simulated patient generation. The dataset may have applications in the study and modeling of PDs and in powering personality-specific text generation for adaptive, personalized chat systems.

Anthology ID: 2026.acl-long.1395
Volume: Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 30239–30259
URL: https://aclanthology.org/2026.acl-long.1395/
Cite (ACL): Federico Ravenda, Seyed Ali Bahrainian, Daniele Montagnani, Antonietta Mira, and Andrea Raballo. 2026. PersonalityDBench: A Dataset for Personality Disorders - from Modeling to Controlled Generation. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 30239–30259, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): PersonalityDBench: A Dataset for Personality Disorders - from Modeling to Controlled Generation (Ravenda et al., ACL 2026)
PDF: https://aclanthology.org/2026.acl-long.1395.pdf
Checklist: 2026.acl-long.1395.checklist.pdf
