C2D2 Dataset: A Resource for Analyzing Cognitive Distortions and Its Impact on Mental Health



Introduction
Cognitive distortions are irrational thinking patterns that can lead to distorted perceptions of reality (Beck, 1970). For example, the thought "I lost my puppy, and my future will be perpetually filled with sadness and loneliness" exemplifies a cognitive distortion known as "Overgeneralization." In this case, the conclusion implies an excessively broad and permanent state of unhappiness. A more precise and less distorted thought for this situation would be: "The sudden loss of my puppy was incredibly painful, and I struggled to come to terms with it." As depicted in Figure 1, cognitive distortions significantly hinder individuals' perception, trapping them further through continuous self-reinforcement and contributing to the development of mental disorders, including depression, anxiety, and post-traumatic stress disorder (PTSD) (Marton et al., 1993; Muris and Field, 2008; Abel et al., 1989; Strohmeier et al., 2016; Hammen, 1978). Hence, the development of automated tools for detecting cognitive distortions is important: it assists researchers in the early detection of cognitive distortions and facilitates timely intervention strategies aimed at enhancing individuals' psychological well-being and happiness.
Previous research demonstrates that computational techniques can effectively detect cognitive distortions from language (Shickel et al., 2020; Simms et al., 2017; Bathina et al., 2021). However, these studies have been conducted using private datasets or datasets with limited and low-quality annotations (Alhaj et al., 2022; Ziems et al., 2022). Despite cognitive psychology acknowledging an association between mental disorders and cognitive distortions, the absence of a reliable and publicly accessible dataset has obstructed the establishment of a credible benchmark and further advancements in this field.
To tackle these challenges, we introduce our C2D2 dataset, a Chinese dataset created to address the shortage of research resources regarding cognitive distortions. It is a publicly accessible resource within this domain that includes 7,500 instances of cognitive distortion thoughts from 450 different scenes. C2D2 can promote research on cognitive distortion and provide insights into users' mental states. We believe this resource will contribute to mental health research in China, a developing country, particularly in light of increasing social pressure and inadequate support for the mental healthcare system.
Moreover, we conduct various experiments, including cognitive distortion detection and an exploration of the relationship between cognitive distortions and mental disorders. We compare the performance of finetuned pretrained models and large language models using in-context learning on the C2D2 dataset, and thereby demonstrate current models' ability to detect cognitive distortions. Furthermore, our research is not limited to cognitive distortions alone. Inspired by psychology, we use computational methods to investigate the association between cognitive distortions and mental health. Using our model, we detect and analyze cognitive distortions in the online texts of individuals with different disorders across several mental health datasets. We observe phenomena that have not been previously considered but could potentially validate the underlying mechanisms of certain disorders. Finally, we develop a simple method to enhance the performance of detecting disorders based on cognitive distortions. The creation of the C2D2 dataset contributes to a better understanding of cognitive distortions and their impact on mental health. Our contributions are as follows:
• We have developed the first publicly accessible Chinese Cognitive Distortion Dataset, aiming to facilitate the analysis of cognitive distortions and promote the integration of psychology and computational technology.
• We explore the association between cognitive distortions and various mental disorders on social media platforms. To our knowledge, this is the first work that utilizes cognitive distortion to assess users' mental states.
• We attempt to incorporate information about

Related Work
In the following sections, we review studies pertaining to cognitive distortions, covering both psychological research and computational techniques.

Cognitive Distortion in Psychology
Since the emergence of cognitive-behavioral theory, cognitive distortions have been the subject of extensive research (Beck, 1970). This line of study, enriched by various researchers, has cultivated a comprehensive theoretical framework (Beck, 2020, 1979; Burns, 1981).
Although originating in depression research, cognitive distortions have also been linked to a variety of issues such as pathological gambling, anxiety, suicide, and anorexia (Marton et al., 1993; Muris and Field, 2008; Abel et al., 1989; Strohmeier et al., 2016; Fortune and Goodie, 2012). From a sociological perspective, they are hypothesized to correlate with juvenile delinquency and antisocial personality (Nas et al., 2005; Wallinius et al., 2011; Gannon and Polaschek, 2006; Feldman, 2007). Furthermore, another branch of research aims to guide psychologists in detecting and rectifying these distortions, thereby studying their impact on treatment (McClenahan, 2005; Yurica and DiTomasso, 2005).

Cognitive Distortion Detection and Application
In the study of mental health computing, the main focus has been on emotion analysis and symptom identification (Gkotsis et al., 2016; Shickel et al., 2016). Some researchers conduct research from the perspective of cognition and emotion (Uban et al., 2021). However, the application of machine learning methods from a cognitive-behavioral perspective has garnered less attention. Prior work has explored the use of machine learning to detect cognitive distortions in mental health texts or on social media (Shickel et al., 2020; Simms et al., 2017; Wang et al., 2022), as well as within medical dialogues between physicians and patients (Shreevastava and Foltz, 2021; Tauscher et al., 2023). The limited exploration in this field can be attributed to the lack of publicly available, well-annotated datasets. Although previous studies on computational mental health primarily focus on emotions and symptoms, we believe that the impact of cognitive distortions and thinking patterns on mental health is also significant.

Dataset Construction
Previous researchers have collected posts from social media platforms or gathered data related to cognitive distortions through crowdsourced writing (Shickel et al., 2016; Alhaj et al., 2022). We have chosen not to directly annotate social media content because its quality and accuracy cannot be effectively controlled. Instead, we have adopted a specially designed task to collect cognitive distortion thoughts. Unlike previous work that recruits volunteers widely from the internet, our data annotation process is executed through a collaborative effort between carefully selected and specifically trained volunteers and domain experts. As depicted in Figure 2, the process has three phases: volunteer recruitment and screening, data annotation, and expert evaluation. We describe our task design and these phases below.

Data Annotation Target
Psychologists have identified various categories of cognitive distortions that are often exhibited in individuals' thoughts (Beck, 1970, 1979; Ellis, 1994). In our work, the C2D2 dataset encompasses seven typical cognitive distortions, which are shown in Table 1. Our annotation task involves providing volunteers with scenes and requesting them to write down multiple possible cognitive distortion thoughts based on the given scenes. An example is presented in Table 2, which displays translated content from our dataset. It should be noted that different types of cognitive distortions do not strictly appear independently. However, to simplify the annotation process, we treat it as a single-label task. The volunteers' goal is to generate instances that represent a single type of distortion. In situations where multiple cognitive distortions occur simultaneously, we ask volunteers to select the label of the dominant cognitive distortion.
We provide the following to volunteers:
Scene: The company's project is about to be delivered, and you caught a cold at this critical juncture.
Table 2: An instance from C2D2 dataset. This example is translated into English.

Volunteer Recruitment & Screening
Psychology Questionnaire We recruit volunteers to participate in our data collection on cognitive distortions. They complete a cognitive distortion questionnaire (Covin et al., 2011). We select volunteers based on their questionnaire results, prioritizing those who show a potential for exhibiting cognitive distortions in daily life. This strategy enables us to collect more authentic cognitive distortion thoughts.

Example thoughts (Chinese original followed by English translation):
情绪化推理：我难受的想死，我就不该养任何生物，养宠物都是令人难过的。
Emotional reasoning: I feel so miserable that I want to die. I shouldn't have kept any living being as a pet. Having pets is just distressing.
读心术：如果当初有选择，它肯定不会愿意跟我回家的，我辜负了它对我的信任。
Mindreading: If there had been a choice back then, she wouldn't have wanted to be my pet; I let her down.
......
类别：家庭问题
Category: Family issues
Volunteer Training To ensure our volunteers understand cognitive distortions and can recognize them in their daily lives, we provide a comprehensive psychology training program. This program includes an examination on the identification and understanding of common cognitive distortions. After two days of cognitive psychology training and an introduction to the harms of cognitive distortions, we retain 24 out of the initial 50 volunteers. These volunteers demonstrate an ability to identify cognitive distortions and acknowledge their harmful effects. All volunteers agree to our data collection and public disclosure requirements.

Data Annotation
Scene Preparation As depicted in Figure 2, we prepare 450 daily scenes in 7 categories: work issues (20%), interpersonal issues (20%), economic issues (10%), random negative events (5%), family issues (25%), physical stress (10%), and discrepancy between ideal and reality (10%) (Lazarus and Folkman, 1984). These categories cover various aspects of life that can trigger cognitive distortions. By providing these scenes, we encourage the volunteers to imagine themselves in these situations and describe how they would perceive the events from the perspective of someone experiencing cognitive distortion thoughts.

Data Collection Volunteers are asked to adopt the perspective of an individual experiencing cognitive distortions and generate thoughts related to a specific cognitive distortion that may arise in the given scene. Moreover, volunteers are asked to label thoughts provided by other volunteers. The final cognitive distortion labels for our data are determined by majority vote among three volunteers, reducing the influence of individual bias. Our primary objective is cognitive distortion detection; therefore, volunteers are assigned overlapping events for data collection.
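The majority-vote step described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `majority_label` is hypothetical, and returning `None` when all three annotators disagree is an assumption, since the text does not say how such cases are resolved.

```python
from collections import Counter
from typing import Optional, Sequence


def majority_label(votes: Sequence[str]) -> Optional[str]:
    """Return the label chosen by a strict majority of annotators, else None."""
    if not votes:
        return None
    label, count = Counter(votes).most_common(1)[0]
    # A strict majority needs more than half of the votes (2 of 3 here).
    # Assumption: items without a majority are left unresolved (None).
    return label if count > len(votes) / 2 else None


votes_agree = ["Overgeneralization", "Overgeneralization", "Fortune-telling"]
votes_split = ["Labeling", "Mindreading", "Personalization"]
print(majority_label(votes_agree))
print(majority_label(votes_split))
```

Items that come back as `None` would need a separate resolution step, for example referral to the expert evaluation phase.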

Expert Evaluation
Our annotated data undergo rigorous expert evaluation to ensure quality. Submissions that do not meet the criteria are sent back; the volunteers concerned are asked to revise their submissions and may receive additional training. The evaluation criteria include:
• Correctness: Experts examine if the assigned labels accurately indicate cognitive distortions.
• Reasonableness: They evaluate if this cognitive distortion aligns with the cognitive distortions that may occur in real-life thinking.
• Emotional Diversity: They verify if the data reflects a broad spectrum of emotions, highlighting the emotional complexity in human cognition.

Data Quality Assurance
To ensure the quality of C2D2, we undertake multiple evaluations. Initially, we compute the inter-annotator Kappa score among our volunteers with regard to our labels. The mean Kappa score is 0.67, which falls in the range conventionally interpreted as substantial agreement among the annotators. Subsequently, we engage experts to examine our complete dataset. These experts are requested to grade the data on a 1 to 5 scale. The assessments return high scores in all categories, with mean ratings of roughly 4.7 for Correctness, 4.5 for Emotional Diversity, and 4.1 for Reasonableness. These scores indicate that our dataset is well aligned with the task instructions and has the characteristics needed to train a model for detecting cognitive distortions.
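As an illustration of how a mean Kappa score of this kind can be obtained, the sketch below averages Cohen's kappa over all annotator pairs. The text does not state which Kappa variant was used, so the pairwise Cohen's formulation, the function names, and the toy labels are all assumptions.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, List


def cohen_kappa(a: List[str], b: List[str]) -> float:
    """Cohen's kappa for two annotators who labeled the same items."""
    if len(a) != len(b) or not a:
        raise ValueError("both annotators must label the same non-empty item list")
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    freq_a, freq_b = Counter(a), Counter(b)
    # Chance agreement: probability that both pick the same label independently.
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1.0 - expected)


def mean_pairwise_kappa(annotations: Dict[str, List[str]]) -> float:
    """Average Cohen's kappa over every pair of annotators."""
    pairs = list(combinations(annotations.values(), 2))
    return sum(cohen_kappa(a, b) for a, b in pairs) / len(pairs)


# Toy labels (hypothetical): OG = Overgeneralization, FT = Fortune-telling, ND = non-distorted.
toy = {
    "annotator_1": ["OG", "OG", "FT", "ND"],
    "annotator_2": ["OG", "FT", "FT", "ND"],
    "annotator_3": ["OG", "OG", "FT", "ND"],
}
print(round(mean_pairwise_kappa(toy), 3))
```

Fleiss' kappa would be a reasonable alternative when every item is labeled by the same number of annotators.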
Data Characteristics

Statistics
The provided dataset statistics are displayed in Table 4, showcasing the overall characteristics of the dataset.It can be seen that the label distribution of our dataset is relatively uniform.After the completion of data collection, our dataset comprises a total of 7,500 thoughts.These thoughts are divided into three sets for training, validation, and testing purposes, following an 8:1:1 ratio.
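For concreteness, an 8:1:1 split of 7,500 thoughts yields 6,000 training, 750 validation, and 750 test instances. The sketch below shows one way to produce such a split; the random shuffle and the fixed seed are assumptions, since the text does not say whether the split is stratified by label or grouped by scene.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_8_1_1(items: Sequence[T], seed: int = 0) -> Tuple[List[T], List[T], List[T]]:
    """Shuffle items and split them into train/validation/test at an 8:1:1 ratio."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # assumption: a plain random split
    n_train = len(shuffled) * 8 // 10
    n_val = len(shuffled) // 10
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # the remainder goes to the test set
    return train, val, test


train, val, test = split_8_1_1(range(7500))
print(len(train), len(val), len(test))
```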

Related Datasets
As shown in Table 3, our C2D2 dataset has many advantages compared to previous works.It is, firstly, a publicly accessible resource within this domain that includes numerous texts recording individuals' thoughts in various scenes.Secondly, we have annotated each thought, not only indicating the presence of cognitive distortion but also categorizing them according to cognitive psychology (Beck, 1979).Finally, we've maintained data authenticity and privacy through comprehensive annotator training, backed by a stringent three-phase data collection process.
Previous works collected data from social media platforms or gathered data related to cognitive distortions through crowdsourced writing. Due to the privacy and sensitivity of mental health data, collaborating with trained volunteers and domain experts to construct reliable synthetic data is a viable alternative. Our approach ensures data quality and authenticity while maximizing the availability of open datasets for researchers, thereby lowering the barriers of data privacy and low data quality that currently exist.
Our data focuses on the Chinese language, particularly in China, a developing country. Previous research on mental health has predominantly been conducted in developed regions with advanced psychological resources. However, mental health issues in developing countries often receive limited attention. In these countries, there is a greater need for affordable and dependable automated detection technologies, whereas developed regions have more reliable mental health services available (Patel and Kleinman, 2003; Kohn et al., 2004; Chowdhary et al., 2014).

Cognitive Distortion Detection
We develop a model to detect cognitive distortions in text and assign specific categories to these cognitive distortions. This detection and categorization can facilitate interventions by psychologists and further analysis for treatment purposes. The task, Cognitive Distortion Detection, takes as input a text X that may contain cognitive distortions and outputs a multi-class label $y \in \{0, 1, \dots, 7\}$. Here, 0 denotes non-distorted, while the other values represent distinct categories of cognitive distortions.

Baseline Models and Results
To evaluate the task, we employ pretrained language models on our C2D2 dataset. Table 5 presents the results of the baseline models. Our objective is to provide researchers with a dataset and establish a benchmark. We finetune several Chinese pretrained language models (Cui et al., 2021, 2020). Additionally, we assess the zero-shot and few-shot settings for LLMs, such as ChatGPT (Ouyang et al., 2022), providing three examples for each class in few-shot learning (please refer to Appendix B and Appendix C for more details). The pretrained models demonstrate satisfactory performance. However, the LLM's in-context learning performance does not match that of the finetuned models for this particular psychological task. Nevertheless, significant improvement is observed with a few examples in the few-shot learning setting. We build our cognitive distortion detection model on multiple models. We believe that tasks such as cognitive distortion detection, which demand specialized analytical capabilities and expertise rather than being universally accessible, warrant increased attention in the future.
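The caption of Table 5 states that all metrics are macro-averaged. As a reminder of what that means for the eight-class label space (0 for non-distorted, 1 to 7 for the distortion types), here is a minimal sketch of macro-F1. Scoring a class that never appears in either the gold or the predicted labels as 0 is an assumption about the convention, and the toy labels are hypothetical.

```python
from typing import Sequence


def macro_f1(y_true: Sequence[int], y_pred: Sequence[int], num_classes: int = 8) -> float:
    """Unweighted mean of per-class F1 scores over classes 0..num_classes-1."""
    f1_scores = []
    for c in range(num_classes):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # Assumption: a class absent from both gold and predictions scores 0.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / num_classes


# Toy three-class example with per-class F1 of 0.5, 0.8 and 2/3.
gold = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
print(round(macro_f1(gold, pred, num_classes=3), 4))
```

Because every class contributes equally, macro-averaging does not let frequent classes dominate the score.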

Cognitive Distortion for Mental Health
After constructing the cognitive distortion detection task, we aim to show the importance of cognitive distortions in mental health. Previous computational research has concentrated on examining the emotions and symptoms exhibited by individuals with mental disorders. In contrast, our objective is to uncover the underlying cognitive distortions. This section aims to examine two fundamental questions in order to underscore the significance of cognitive distortion detection.
Q1: How do thinking patterns differ between individuals diagnosed with mental disorders and the normal group?
In Section 6.2, we employ our cognitive distortion detection model to analyze social media posts from individuals diagnosed with depression and PTSD, as well as from a normal group without any self-reported mental health issues. From a cognitive psychology perspective, we aim to identify cognitive distortions and use them to detect differences between these groups.
Q2: Can these differences be utilized for mental disorder detection?
In Section 6.3, we integrate cognitive distortions into a method for detecting mental disorders in a simple way, with a specific focus on depression and PTSD. This approach underscores the potential utility of cognitive distortion to improve mental disorder detection, particularly when conventional research largely concentrates on symptoms and emotional manifestations.

The original C2D2 dataset is in Chinese. Given that most existing mental health datasets are in English, we have translated C2D2 into English, producing C2D2-E (the English version of C2D2). We have used machine translation tools for this purpose and performed sampling inspections to ensure quality. The label distribution and content of C2D2-E mirror those of the original C2D2 dataset.

Cognitive Distortions Analysis on Social Media
In this section, we use the BERT-based baseline model, trained on the C2D2-E dataset, to detect cognitive distortions in social media posts from individuals diagnosed with depression and PTSD, as well as from a normal group. We focus on determining the prevalence of cognitive distortions on social media platforms. We define cognitive distortion prevalence as $p_{freq}$, which is computed from $N_{nor}$, the number of normal posts, and $N_{cd}$, the number of posts containing cognitive distortions. We calculate $p_{freq}$ for each user and present the results in a box plot.
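A per-user prevalence of this kind can be sketched as below. The exact formula is not reproduced in the text here, so the sketch assumes the most natural reading, namely the share of a user's posts flagged as distorted, N_cd / (N_nor + N_cd). The function name, the label encoding (0 for non-distorted), and the toy users are likewise assumptions.

```python
from statistics import mean
from typing import Dict, List

NON_DISTORTED = 0  # assumed label for a normal post


def p_freq(post_labels: List[int]) -> float:
    """Share of a user's posts predicted to contain a cognitive distortion."""
    n_cd = sum(1 for y in post_labels if y != NON_DISTORTED)
    n_nor = len(post_labels) - n_cd
    total = n_nor + n_cd
    # Assumed formula: N_cd / (N_nor + N_cd); users with no posts get 0.
    return n_cd / total if total else 0.0


# Toy predictions per user (hypothetical): one predicted label per post.
users: Dict[str, List[int]] = {
    "user_a": [0, 0, 3, 0],
    "user_b": [0, 0, 0, 0, 0, 0, 0, 0, 0, 5],
}
per_user = {name: p_freq(labels) for name, labels in users.items()}
group_mean = mean(list(per_user.values()))
print(per_user, group_mean)
```

The per-user values are what a box plot per group would summarize; the group mean corresponds to the averages reported for each group.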
Figure 3 illustrates that, on the CLPsych-2015 dataset, the $p_{freq}$ of cognitive distortions expressed on social media by individuals diagnosed with depression is higher than that of both the normal group and individuals with PTSD. The average $p_{freq}$ is 13% for the depression group, 8% for the PTSD group, and 3% for the normal group. Our results indicate that cognitive distortions are most prominent in patients with depression, followed by those with PTSD, compared to individuals without any reported mental disorders. While cognitive distortions exist among the normal group, they are less prevalent than among individuals diagnosed with mental disorders. By comparing $p_{freq}$ across mental disorders, we find support for the relationship between cognitive distortions and mental disorders, indicating that an increase in cognitive distortions could potentially serve as an overlooked characteristic for detecting mental disorders.

Figure 4 shows our analysis of the average distribution of cognitive distortion types related to PTSD and depression using the CLPsych-2015 dataset. Our results reveal distinct cognitive distortion patterns across different mental disorders. Specifically, individuals diagnosed with depression show a heightened tendency towards emotional reasoning and black and white thinking. In contrast, patients diagnosed with PTSD display a significant occurrence of labeling. This unexpected finding suggests that PTSD patients may excessively categorize benign situations as potential threats due to their traumatic experiences (Brewin, 2001; Van der Kolk, 2022). Our study of social media data underscores the importance of investigating cognitive distortions, potentially contributing to the improvement of therapeutic methods for treating PTSD. Moreover, our results highlight the feasibility of using computational techniques to analyze underlying psychological phenomena from data.
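The type-level comparison in Figure 4 can be approximated with a short sketch like the one below. It assumes that the "average distribution" is obtained by normalizing each user's distortion-type counts and then averaging across the users of a group; the text does not spell out this procedure. The abbreviations follow the Figure 4 caption, while the function names and toy data are hypothetical.

```python
from collections import Counter
from typing import Dict, List

# Abbreviations as listed in the Figure 4 caption.
TYPES = ["MR", "FT", "BW", "LL", "OG", "PZ", "ER"]


def type_distribution(predicted: List[str]) -> Dict[str, float]:
    """Normalized distortion-type frequencies for one user (non-distorted posts ignored)."""
    counts = Counter(t for t in predicted if t in TYPES)
    total = sum(counts.values())
    return {t: (counts[t] / total if total else 0.0) for t in TYPES}


def group_average(users: List[List[str]]) -> Dict[str, float]:
    """Average the per-user distributions over all users in a group."""
    if not users:
        return {t: 0.0 for t in TYPES}
    dists = [type_distribution(u) for u in users]
    return {t: sum(d[t] for d in dists) / len(dists) for t in TYPES}


# Toy group (hypothetical): "ND" marks a non-distorted post.
toy_group = [["ER", "ER", "BW", "ND"], ["LL", "ER"]]
avg = group_average(toy_group)
print({t: round(v, 3) for t, v in avg.items()})
```

Averaging per user rather than pooling all posts keeps highly active users from dominating a group's profile.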

Enhancing Mental Disorder Detection via Cognitive Distortion
After discovering the aforementioned phenomenon, we attempt to incorporate cognitive distortion features into a simple mental disorder detection model. Our approach involves analyzing a user's posting history, denoted as P, and extracting posts that contain cognitive distortions, denoted as C, using a finetuned BERT model (e.g., "Overgeneralization: I failed an exam, so the whole discussion will fail."). By integrating cognitive distortions into the detection process, we aim to improve the performance of detecting mental disorders.

Method Our approach involves utilizing an LSTM to capture the user's historical context and model cognitive distortions. The resulting hidden states are then combined with post popularity in a feedforward neural network for final classification. In this formulation, the hidden state $h_p \in \mathbb{R}^{128}$. We incorporate content features and numerical features related to cognitive distortion in our model. Our experiment is deliberately straightforward: we aim to demonstrate the role of cognitive distortion through this direct approach.
Result We conduct a preliminary evaluation of incorporating cognitive distortion information into the detection of mental disorders. The results show some improvement in detection performance when users' cognitive distortion information is included in the model. Across different platforms and mental disorders, the small modifications we made by incorporating cognitive distortions yielded better results than the original model. We believe that our experiment demonstrates the potential to enhance the performance of existing mental disorder detection models through modeling cognitive distortions.

Conclusion
We have introduced the C2D2 dataset, which is the first public Chinese dataset focused on cognitive distortions. Our analysis with this dataset points to a connection between cognitive distortions and mental disorders. Addressing these distortions could potentially enhance mental health interventions. By integrating cognitive distortions as an additional feature, we have improved the performance of existing mental disorder detection models. The introduction of the C2D2 dataset represents a valuable contribution to computational psychology and serves as a catalyst for further research in this emerging field.

Limitation
While collecting data, we try our best to ensure its quality. However, our data may still contain biases. Nevertheless, we believe the data remain valuable.
The observed phenomena in mental illness data are validated using two mental health datasets, with the comparison between PTSD and depression based on a single dataset.These datasets may have biases, but gathering diverse user-level data on various mental illnesses is challenging for us.Therefore, we don't claim our findings as definitive in psychology.However, we believe that these phenomena are likely to be widespread.We encourage future researchers to analyze them further with more data.Recognizing and interpreting the data require additional endorsement from psychology experts, beyond the scope of our work.
We believe that there are many areas within our dataset that can be further explored and utilized, and our work only covers a small portion that we consider representative but limited.

Broader Impact and Ethical Considerations

Ethical Considerations
When dealing with sensitive data such as the psychological well-being of human subjects, special care must be taken.In this case, our main goal is to provide a dataset for the general public, which makes confidentiality even more important.
Our research has received approval from the Institutional Review Board (IRB) of our institution. All data annotators involved are over 18 years old and have signed informed consent forms agreeing to the public release of the data. We have removed any data that may be personally identifiable to the data subjects. Additionally, during the data annotation process, a mental health expert monitored the psychological well-being of the volunteers. Volunteers had the option to withdraw from the process at any time. Finally, the remaining data we used is from publicly available datasets, and the ethical considerations for these datasets have been guaranteed by the dataset creators.

Positive Outcomes
• Improved Mental Health Services: Therapists and psychologists could use these techniques to identify cognitive distortions quickly, allowing them to dedicate more time to therapeutic processes.

Negative Outcomes and Mitigation Strategies
• Misinterpretation of Data: Automated systems can make mistakes, and those mistakes could have serious consequences in the realm of mental health. An incorrect interpretation of a person's statements could lead to unnecessary worry or intervention, or it might miss a person who genuinely needs help.
• Over-reliance on Technology: While our work could significantly aid mental health professionals, there is a risk that some may become over-reliant on these computational techniques. Human judgment and intuition remain essential in mental health services, and a balance between technological and human input must be maintained.

A Ensuring Volunteer Well-being
To prioritize the mental health of our volunteers, we have implemented the following measures throughout the data collection process:
• Precautions: Prior to participation, volunteers are provided with detailed information about the task, including potential exposure to challenging situations and cognitive distortions.
• Informed Consent: Volunteers provide informed consent before engaging in the data collection process. We emphasize that their participation is entirely voluntary, and they have the freedom to withdraw at any time without facing any consequences.
• Supportive Environment: We maintain open channels of communication with volunteers, encouraging them to share any concerns or difficulties they may encounter during the task.
• Anonymity and Confidentiality: Volunteers are assured that their identities will remain anonymous and their personal information will be kept confidential. This fosters a safe space for open and honest participation.
By implementing these measures, we demonstrate our commitment to the well-being of our volunteers and ensure an ethical and responsible data collection process.

B Training Details
We describe here how we test the large language model and finetune the pretrained models. Specifically, the prompt used for the LLM and the training details are as follows. For Mental Disorder Detection, both non-pretrained models and the pretrained model, in their base versions, are utilized. The learning rate for non-pretrained models is set to 1e-3, while the pretrained model has a learning rate of 1e-6. The LSTM hidden layer state is configured with 128 dimensions to capture complex temporal dependencies. As in Cognitive Distortion Detection, the AdamW optimizer is employed. The official test set is used for evaluation, and a separate validation set is created from the training set. The model exhibiting the lowest loss on the validation set is selected for testing. These hyperparameter settings and model choices are determined through empirical evaluation and existing literature on similar tasks.

C Detailed Cognitive Distortion Detection Results
As shown in Table 8, we also provide experimental results for C2D2-E. They are similar to the results for C2D2 but generally higher. Apart from the impact of translation, we attribute this to existing models' stronger ability to comprehend English, especially for today's LLMs.

As shown in Figure 5, we present the results obtained on the eRisk-2018 dataset. The differences in cognitive distortions between the depression group and the normal group are again evident. In this dataset, the proportion of normal users exhibiting cognitive distortions is higher than in the CLPsych-2015 dataset.

Figure 1: Cognitive distortions' impact on individuals. Cognitive distortions are constantly strengthened within this reinforcing loop, contributing to the development of mental disorders such as depression, PTSD, and anxiety (Burns, 1981).
Scene category: Work issues
The content completed by volunteers is as follows:
Cognitive Distortion: Cold & overtime, bad luck peaks! Why all misfortune on me! (Overgeneralization)
Cognitive Distortion: Is my illness at such a critical moment a sign that something will go wrong with the project as well? (Fortune-telling)
Non-distorted: Feeling terrible with a cold - my nose and throat are both sore. So painful! (Non-distorted)
. . .
At least 5 different types of cognitive distortions and 2 normal thoughts.

Figure 2: The data collection process includes volunteer recruitment and screening, data annotation, and a final expert evaluation. During the volunteer recruitment phase, we carefully select volunteers using a cognitive distortion questionnaire. The data annotation phase collects cognitive distortions. The expert evaluation ensures the reliability of our data.

Figure 3: Box plot of the distribution of $p_{freq}$ across different groups. The vertical axis represents the $p_{freq}$ of cognitive distortions, and the horizontal axis represents the mental disorders of the users.

Figure 4: Category Differences in Cognitive Distortions among Different Mental Disorders. MR stands for MindReading, FT stands for Fortune-Telling, BW stands for Black and White thinking, LL stands for Labeling, OG stands for Overgeneralization, PZ stands for Personalization, and ER stands for Emotional Reasoning.

Description: Cognitive distortions refer to detrimental patterns of thinking that are prevalent in individuals. Assistance is required to identify these patterns of thinking based on the textual content. The following are the distinct categories of cognitive distortions along with their precise definitions:
Definition: Definitions of seven types of cognitive distortions
Examples: A total of twenty-four illustrative instances
Thought: Input data

For Cognitive Distortion Detection, the learning rate is set to 1e-5, and the AdamW optimizer is employed. The base version of the model, without any modifications, is used. The learning rate and optimizer are chosen based on empirical observations and previous studies in the field. The model exhibiting the lowest loss on the validation set, which is separated from the training set, is selected for testing. The results are averaged over 5 runs to ensure robustness and mitigate the effects of random initialization.
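The prompt layout listed in this appendix (description, definitions, examples, thought) can be assembled programmatically. The sketch below is only an illustration of that layout: the function name `build_prompt`, the exact wording of each line, and the `Label:` field are hypothetical, and the definitions are left as placeholders rather than invented. Passing an empty example list corresponds to the zero-shot setting; three examples per class over eight classes would give the twenty-four instances mentioned above.

```python
from typing import Dict, List, Tuple


def build_prompt(definitions: Dict[str, str],
                 examples: List[Tuple[str, str]],
                 thought: str) -> str:
    """Assemble a description / definitions / examples / thought prompt."""
    lines = [
        "Description: Cognitive distortions refer to detrimental patterns of thinking. "
        "Identify the category of the thought below.",
        "Definitions:",
    ]
    for name, text in definitions.items():
        lines.append(f"- {name}: {text}")
    if examples:  # few-shot setting; an empty list gives a zero-shot prompt
        lines.append("Examples:")
        for example_thought, label in examples:
            lines.append(f"Thought: {example_thought}")
            lines.append(f"Label: {label}")
    lines.append(f"Thought: {thought}")
    lines.append("Label:")
    return "\n".join(lines)


# Placeholder definitions; the two example thoughts are taken from Table 2.
definitions = {"Overgeneralization": "<definition>", "Fortune-telling": "<definition>"}
examples = [
    ("Cold & overtime, bad luck peaks! Why all misfortune on me!", "Overgeneralization"),
    ("Feeling terrible with a cold - my nose and throat are both sore.", "Non-distorted"),
]
prompt = build_prompt(definitions, examples, "Input data")
print(prompt)
```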

Figure 5: Box plot of the distribution of $p_{freq}$ across different groups on eRisk-2018.

Table 1: Definitions and Examples of Cognitive Distortions. Our data contains seven common cognitive distortions.

Figure 2 panel text:
P1: Volunteer Recruitment & Screening. Questionnaire item: "When a negative event occurs, people may think that more bad things are about to happen and see the negative event as the beginning of a pattern. How many similar thoughts like this do you have in your life?" Eliminated: questionnaire score is below 15, which means it is harder to generate cognitive distortion text. Retained: questionnaire score is above 15, which means it is easier to generate cognitive distortion text.
P2: Data Annotation. Question: …

Table 3: Comparison of the key features of our work with previous studies.

Table 5: Performance of baseline models for the C2D2 tasks. All metrics are calculated using macro-averaging.

Table 7: Comparison of performance metrics for different models on multiple datasets. The method we use and the obtained values are shown in bold.