PAL: Persona-Augmented Emotional Support Conversation Generation

Due to the lack of human resources for mental health support, there is an increasing demand for employing conversational agents for support. Recent work has demonstrated the effectiveness of dialogue models in providing emotional support. As previous studies have demonstrated that seekers' persona is an important factor for effective support, we investigate whether there are benefits to modeling such information in dialogue models for support. In this paper, our empirical analysis verifies that persona has an important impact on emotional support. Therefore, we propose a framework for dynamically inferring and modeling seekers' persona. We first train a model for inferring the seeker's persona from the conversation history. Accordingly, we propose PAL, a model that leverages persona information and, in conjunction with our strategy-based controllable generation method, provides personalized emotional support. Automatic and manual evaluations demonstrate that PAL achieves state-of-the-art results, outperforming the baselines on the studied benchmark. Our code and data are publicly available at https://github.com/chengjl19/PAL.


Introduction
A growing number of people are experiencing mental health issues, particularly during the Covid-19 pandemic (Hossain et al., 2020; Talevi et al., 2020; Cullen et al., 2020; Kumar and Nayar, 2021), and more and more people are seeking mental health support. The high costs and limited availability of support provided by professional mental health supporters or counselors (Kazdin and Blase, 2011; Olfson, 2016; Denecke et al., 2020; Peterson, 2021) have highlighted the importance of employing conversational agents and chatbots for automating this task (Cameron et al., 2018; Daley et al., 2020; Denecke et al., 2020; Kraus et al., 2021). Towards this end, Liu et al. (2021) pioneered the task of emotional support conversation generation to reduce users' emotional distress and improve their mood using language models. They collected ESConv, a high-quality crowd-sourced dataset of conversations (with annotated helping strategies) between support seekers and trained emotional supporters, and demonstrated that training large pretrained dialogue models on this dataset enabled these models to provide effective support. Tu et al. (2022) proposed to leverage commonsense knowledge and implemented hybrid strategies to improve the performance of dialogue models in this task. Similarly, Peng et al. (2022) also suggested using commonsense knowledge for this task and further proposed a global-to-local graph network to model local and global hierarchical relationships. More recently, Cheng et al. (2022) proposed look-ahead strategy planning to select strategies that are more effective for long-turn interactions.
Although previous studies have considered relevant psychological theories and factors, such as commonsense reasoning, they neglect information regarding the users' persona. In psychology, persona can be considered an outward expression of personality (Leary and Allen, 2011) and is closely related to empathy (Richendoller and Weaver III, 1994; Costa et al., 2014), anxiety (Smrdu et al., 2021), frustration (Jeronimus and Laceulle, 2017), mental health (Michinov and Michinov, 2021), and distress (Liu et al., 2018), all of which are essential concepts in psychological scenarios. Effective emotional support benefits from an adequate understanding of the support seeker's personality, as shown by research on person-centered therapy (Rogers, 2013), and more specific and persona-related words lead to long-term rapport with the user (Campos et al., 2018). Thus, the inability to actively combine persona information and conversations prevents users from developing such rapport with the system (Xu et al., 2022), which is undesirable for emotional support. Therefore, it is intuitive to explore seekers' personas and build systems that provide personalized emotional support.
In this paper, we propose Persona-Augmented EmotionaL Support (PAL), a conversational model that learns to dynamically leverage seekers' personas to generate more informative and personalized responses for effective emotional support. To more closely match realistic scenarios (no prior knowledge of the user's persona) and to retain important user information from earlier conversation rounds, we first extract persona information about the seeker from the conversation history and design an attention mechanism to enhance the understanding of the seeker. Furthermore, we propose a strategy-based controllable generation method to actively incorporate persona information in responses for better rapport with the user. We conduct our experiments on the ESConv dataset (Liu et al., 2021). Our results demonstrate that PAL outperforms the baselines in automatic and manual evaluations, providing more personalized and effective emotional support. We summarize our contributions as follows:
• To the best of our knowledge, our work is the first to propose leveraging persona information for emotional support.
• We propose a model for dynamically extracting and modeling seekers' persona information and a strategy-based decoding approach for controllable generations.
• Our analysis of the relationship between the degree of individuality and the effect of emotional support, together with our experiments on the ESConv dataset and comparisons with the baselines, highlights the necessity and effectiveness of modeling and leveraging seekers' persona information.
Related Work

Persona in Conversation Generation
There are extensive studies on leveraging persona information in dialogue (Huang et al., 2020). However, it is important to note that the definition of persona in this context differs from its definition in psychology: in dialogue systems, persona refers to the user's characteristics, preferences, and contextual information, which are incorporated to enhance the system's understanding and generation capabilities. Li et al. (2016b) proposed using persona embeddings to model background information, such as the users' speaking style, which improved speaker consistency in conversations. However, as stated by Xu et al. (2022), this approach is less interpretable. Therefore, several approaches were proposed to directly and naturally integrate persona information into the conversation (Zhang et al., 2018; Wolf et al., 2019; Liu et al., 2020; Yang et al., 2021). Zhang et al. (2018) collected PERSONA-CHAT, a high-quality dataset of conversations with personas annotated by crowd-sourcing workers. This dataset has been widely used to further explore personalized conversation models and how persona can benefit response generation in conversations (Wolf et al., 2019; Liu et al., 2020; Yang et al., 2021). However, it is relatively difficult to obtain users' personas in real-world applications, as requiring users to provide information regarding their personas prior to conversations is impractical and unnatural. Xu et al. (2022) addressed this problem by training classifiers that determine whether sentences in the conversation history include persona information; such sentences are then stored and leveraged to generate responses. However, in many cases, users do not explicitly express persona information in the conversation, and extracting it often requires a certain level of reasoning. For instance, a user may say, "My friend likes to play Frisbee, so do I", which contains no explicit persona statement, yet one can infer that the user likes to play Frisbee. In this work, we aim to infer possible persona information from the conversation history to assist our model in better understanding the user.

Emotional Support
In recent years, an increasing number of approaches have focused on emotional and empathetic response generation (Zhou et al., 2018; Zhong et al., 2020; Kim et al., 2021; Gao et al., 2021a; Zheng et al., 2021; Sabour et al., 2022b). However, although such concepts are essential, they are insufficient for providing effective support, as this task requires tackling the user's problem via various appropriate support strategies while exploring and understanding their mood and situation (Liu et al., 2021; Zheng et al., 2022). Therefore, Liu et al. (2021) proposed the task of Emotional Support Conversation Generation and created a set of high-quality conversations between trained crowd-sourcing workers. Their work demonstrated that training widely-used dialogue models, such as Blenderbot (Roller et al., 2021), on their collected dataset enabled such models to provide effective emotional support. Following their work, Tu et al. (2022) proposed leveraging external commonsense knowledge to better understand the users' emotions and suggested using a mixture of strategies for response generation. Peng et al. (2022) implemented a hierarchical graph network to model the associations between global causes and local intentions within the conversation. Cheng et al. (2022) proposed multi-turn strategy planning to assist in choosing strategies that are beneficial in the long term. However, existing work has not explored the effects of dynamically modeling users' persona information in this task, which we hypothesize improves models' emotional support ability and enables more personalized support.

Persona-Augmented Emotional Support
Figure 2 shows the overall flow of our approach. We first infer the seeker's persona information from the conversation history and then leverage the inferred information to generate a response. Our approach comprises three major components: the persona extractor, which infers the seeker's persona information (§3.2); the response generator, which leverages the inferred persona information and generates the response distribution (§3.3); and a strategy-based controllable decoding method for generating appropriate responses (§3.4).
Figure 2: The overall structure of Persona-Augmented Emotional Support (PAL). We extract the seeker's persona from the dialogue history and then use a controllable generation method to generate the response. α is a tunable hyperparameter.

Problem Formulation
For inferring users' personas, we leverage the PERSONA-CHAT dataset (Zhang et al., 2018), a high-quality collection of conversations between crowd-sourcing workers assigned a set of predefined persona sentences. Assume that a conversation between two speakers A and B is represented as D = {u^A_1, u^B_1, . . ., u^A_n, u^B_n}, where u^A_i and u^B_i represent the respective utterances of each speaker in the conversation, and n indicates the number of utterances. Accordingly, assume that each speaker has a set of persona information P^A = {p^A_1, . . ., p^A_{m_A}} and P^B = {p^B_1, . . ., p^B_{m_B}}, where p^A_i and p^B_i represent the persona sentences of each speaker, respectively. Our first task is to infer a speaker's persona information from their utterances in the conversation (e.g., inferring P^A from A's utterances).

As mentioned, we adopt the ESConv dataset (Liu et al., 2021) to train our model for providing emotional support. Assume that a conversation between a support seeker A and a supporter B at the t-th turn of the conversation is D_t = {u^A_1, u^B_1, . . ., u^A_t}, where u^A_i and u^B_i represent the utterances of the seeker and the supporter, respectively. Our task is two-fold: first, we infer the seeker's persona information P^A from their utterances; then, we leverage the inferred information P^A and the conversation history D_t to generate an appropriate supportive response u^B_t.

Persona Extractor
As previously stated, it is beneficial and essential to study the effects of leveraging persona information in the emotional support task.As predicting the seeker's persona information before the conversation is impractical, inferring such information from their utterances is necessary.
Based on the problem formulation in §3.1, we fine-tune bart-large-cnn and use it to augment the ESConv (Liu et al., 2021) dataset with inferred persona annotations for each turn of the conversations. More details can be found in Appendix A. Since the initial utterances in this dataset generally contain greetings, we annotate persona information starting from the third utterance of the conversation. Table 1 shows an example of such annotations. We refer to this dataset with the additional annotations as Personalized Emotional Support Conversation (PESConv).
We analyze PESConv to confirm that modeling persona is essential for emotional support.In the original ESConv dataset, workers score conversations based on the supporter's empathy level, the relevance between the conversation topic and the supporter's responses, and the intensity of the seeker's emotion.For each of these three aspects, we calculate the average cosine similarity between the responses and persona information in a conversation to examine how closely the responses and persona information are related.
For this task, we leverage SimCSE (Gao et al., 2021b), a sentence embedding model trained with a contrastive learning approach, to obtain vector representations of the sentences in PESConv. As illustrated in Figure 3, clearer and more appropriate mentions of the seekers' persona in the supporters' responses lead to higher values for the studied aspects (i.e., higher empathy, more relevance, and a larger decrease in emotional intensity). We believe this further highlights the necessity of modeling persona information for providing effective emotional support. Moreover, we repeat the analysis with fastText (Joulin et al., 2017), which represents sentences as averaged word embeddings, and the results (Appendix B) demonstrate similar findings.
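The per-conversation similarity described above can be sketched as follows; the function names and the random placeholder vectors are ours, and in the actual analysis the embeddings would come from SimCSE (or averaged fastText vectors):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two sentence embeddings."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def avg_response_persona_similarity(response_embs, persona_embs):
    """Average cosine similarity over all (response, persona) pairs in a
    conversation, used here to probe how persona-related the responses are."""
    sims = [cosine_similarity(r, p) for r in response_embs for p in persona_embs]
    return sum(sims) / len(sims)

# Placeholder vectors standing in for encoded supporter responses and
# inferred persona sentences of one conversation.
rng = np.random.default_rng(0)
response_embs = [rng.normal(size=8) for _ in range(3)]
persona_embs = [rng.normal(size=8) for _ in range(2)]
print(round(avg_response_persona_similarity(response_embs, persona_embs), 4))
```

Averaging this quantity per conversation and binning by the annotated scores yields plots like Figure 3.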

Modeling Seekers' Persona
As illustrated in Figure 2, our model takes persona information as input in addition to the dialogue history. Formally, we use Transformer (Vaswani et al., 2017) encoders to obtain the inputs' hidden representations:

H_P = Enc(p_1 [SEP] . . . [SEP] p_m),  H_D = Enc(u_1 [SEP] . . . [SEP] u_n),  (1)

where Enc is the Transformer encoder, and m and n represent the number of persona sentences and conversation utterances, respectively. The special token [SEP] is used for sentence separation.
To highlight the context related to the seeker's persona, we calculate an extra attention Z_D over H_D and obtain a new hidden representation Ĥ_D for the dialogue history:

Z_D = Attn(H_D, H_P, H_P),  Ĥ_D = LN(H_D + Z_D),  (2)

where LN stands for the LayerNorm operation (Ba et al., 2016) and Attn(Q, K, V) denotes scaled dot-product attention. Similarly, to promote persona sentences that are more aligned with the provided context, we obtain Ĥ_P by

Ĥ_P = LN(H_P + Attn(H_P, H_D, H_D)).  (3)

This also enables the model to neglect inferred persona sentences that are incorrect or irrelevant to the dialogue history. Since we cannot guarantee that the inferred persona information is complete, we calculate the weighted sum of Ĥ_D, Ĥ_P, and H_D to obtain the final hidden states used as the decoder's input:

H = w_1 Ĥ_D + w_2 Ĥ_P + w_3 H_D,  (4)

where w_1, w_2, w_3 are additional model parameters with the same initial value. This ensures that the essence of the original dialogue context is largely preserved. Similar to Liu et al. (2021), we use special tokens to represent strategies and prepend them to the corresponding sentences. Our training objective can be formalized as

L = -(1/N) Σ_{i=1}^{N} log p(r_i | D, P, s, r_{<i}),  (5)

where s stands for the strategy, r for the response, and N is the length of r.

Table 1: An example of persona annotations in PESConv.
Seeker: Hello → (no persona)
Supporter: Hi there! How may I support you today? → (no persona)
Seeker: I'm just feeling anxious about my job's future. A lot of my colleagues are having trouble getting their licenses because of covid which means we won't be able to work. → Persona: I am worried about my job's future.
Supporter: That must be hard. COVID has turned our world upside down! What type of occupation are you in? → Persona: I am worried about my job's future.
Seeker: I'm studying to be a pharmacist. → Persona: I am worried about my job's future. I'm studying to be a pharmacist.

Figure 3: The relationship between the empathy score, relevance score, emotion intensity decrease score, and the similarity between supporters' responses and persona information using SimCSE. We can observe that, in general, higher similarity leads to higher scores. In addition, we display the trend line and the coefficient of determination.
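The fusion step described above can be sketched in numpy as follows. The attention directions and the mean-pooling of the persona states (introduced so the weighted sum is well-defined when the persona and dialogue sequences have different lengths) are our assumptions, and all dimensions are illustrative:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def layer_norm(x, eps=1e-5):
    return (x - x.mean(-1, keepdims=True)) / (x.std(-1, keepdims=True) + eps)

def cross_attend(q, k, v):
    """Scaled dot-product attention; the queries attend over keys/values."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores) @ v

def fuse_persona(H_D, H_P, w1=1.0, w2=1.0, w3=1.0):
    """Residual cross-attention in both directions, then a weighted mix of
    the persona-aware dialogue states, the context-filtered persona states,
    and the raw dialogue states. Pooling the persona states is our
    simplification; the paper's exact combination may differ."""
    H_D_hat = layer_norm(H_D + cross_attend(H_D, H_P, H_P))
    H_P_hat = layer_norm(H_P + cross_attend(H_P, H_D, H_D))
    persona_summary = H_P_hat.mean(axis=0, keepdims=True)
    return w1 * H_D_hat + w2 * persona_summary + w3 * H_D

rng = np.random.default_rng(0)
H_D = rng.normal(size=(6, 16))   # 6 dialogue tokens, hidden size 16
H_P = rng.normal(size=(3, 16))   # 3 inferred persona sentences
H = fuse_persona(H_D, H_P)
print(H.shape)
```

The output keeps the dialogue sequence length, so it can feed the decoder in place of the plain encoder states.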

Strategy-based Controllable Generation
Supporters' responses in the emotional support task are annotated with several support strategies, which are essential for providing effective support (Liu et al., 2021). For instance, the supporter may choose to ask a Question or provide statements of Reaffirmation and Confirmation depending on the situation. We provide more descriptions of these strategies in Appendix C. Intuitively, the choice of strategy corresponds to the available knowledge of the user's persona, demonstrating the importance of strategy selection in our proposed approach. For instance, supporters could choose Providing Suggestions if they have sufficient knowledge of the user's persona and situation, while they would resort to Question if they lack such information. Therefore, we propose an innovative strategy-based controllable generation method for the decoding phase. We decompose the generation probability as

p̃(r_i | D, P, s, r_{<i}) ∝ p(r_i | D, P, s, r_{<i}) · [p(r_i | D, P, s, r_{<i}) / p(r_i | D, s, r_{<i})]^α,  (6)

where the ratio measures how much more likely a token becomes when the persona is included in the input. As the ratio increases, the token becomes more relevant to the persona information, increasing its likelihood of being generated once such persona information is added. Therefore, employing Eq. 6 increases the likelihood of tokens that are more relevant to the persona information. α is set to different values depending on the strategy; the values used for all strategies are listed in Table 2.
We investigate the values of α corresponding to different strategies and define three categories: high, medium, and low, which correspond to 0.75, 0.375, and 0, respectively.More details about the tuning process of these values are discussed in Appendix D.
We provide explanations for two of our chosen α values. For effective support, two types of questions (the Question strategy) can be asked of the seeker (Ivey et al., 2013): open and closed. We choose the low level for this strategy to avoid over-relying on persona information, which would otherwise result in fewer open questions. We choose the high level for the Providing Suggestions strategy, as the model needs to focus more on persona information to provide more appropriate and specific suggestions. See Appendix E for explanations regarding the α values of the other strategies.
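The strategy-based reweighting described above can be sketched as follows, working in log space; the toy vocabulary and distributions are illustrative:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def persona_reweighted(logp_with, logp_without, alpha):
    """Rescale next-token probabilities by how much the persona raises each
    token's likelihood: p'(x) ∝ p_with(x) * (p_with(x) / p_without(x))^alpha.
    alpha = 0 recovers the plain persona-conditioned distribution; larger
    alpha (chosen per strategy) favors persona-relevant tokens."""
    scores = logp_with + alpha * (logp_with - logp_without)
    return softmax(scores)

# Toy 4-token vocabulary; token 2 is the one the persona makes more likely.
logp_with = np.log(np.array([0.3, 0.3, 0.3, 0.1]))
logp_without = np.log(np.array([0.4, 0.4, 0.1, 0.1]))
for alpha in (0.0, 0.375, 0.75):
    print(alpha, np.round(persona_reweighted(logp_with, logp_without, alpha), 3))
```

In practice the two distributions would come from the same model with and without the persona in its input, and alpha would be looked up from the predicted strategy's level (0, 0.375, or 0.75).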

Persona Extractor Evaluation
Human Evaluation To validate the effectiveness of our persona extractor, we first manually reviewed several inferences and found that the main errors could be categorized as contradictions (i.e., personas containing factual errors) or hallucinations (i.e., personas containing unreasonable and irrelevant deductions from the conversation). An example of a contradiction would be an inferred persona of "I am a woman" when the seeker has mentioned in the conversation that he is a man. An instance of a hallucination would be an inferred persona of "I am a plumber" when the seeker has not mentioned their occupation. We then chose 100 samples at random and hired workers on Amazon Mechanical Turk (AMT) to annotate each sample with one of four options: Reasonable, Contradictory, Hallucinatory, or Others. If the option Others was chosen, we asked workers to elaborate on the error. The annotators considered 87.3% of the inferred persona samples Reasonable, while marking 8% and 4% of the samples as Contradictory and Hallucinatory, respectively. Moreover, only 0.667% of the samples were marked as Others. Upon further analysis, we found that such samples could also be classified into one of the mentioned error categories (see Appendix F for more details). The inter-annotator agreement, measured by Fleiss's kappa, was 0.458, indicating moderate agreement.
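The agreement statistic reported above can be computed as follows; the toy rating matrix is illustrative, not our actual annotation data:

```python
import numpy as np

def fleiss_kappa(counts) -> float:
    """Fleiss's kappa for an (items x categories) matrix of rating counts;
    every row must sum to the same number of raters n."""
    counts = np.asarray(counts, dtype=float)
    N, _ = counts.shape
    n = counts[0].sum()
    # Per-item observed agreement, then its mean.
    P_i = (np.square(counts).sum(axis=1) - n) / (n * (n - 1))
    P_bar = P_i.mean()
    # Chance agreement from overall category proportions.
    p_j = counts.sum(axis=0) / (N * n)
    P_e = np.square(p_j).sum()
    return (P_bar - P_e) / (1 - P_e)

# Toy example: 4 samples, 3 annotators, 4 options
# (Reasonable, Contradictory, Hallucinatory, Others).
ratings = np.array([[3, 0, 0, 0],
                    [2, 1, 0, 0],
                    [3, 0, 0, 0],
                    [1, 2, 0, 0]])
print(round(fleiss_kappa(ratings), 3))
```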

Baselines
Blenderbot-Joint (Liu et al., 2021): Blenderbot (Roller et al., 2021) fine-tuned on the ESConv dataset. This model is trained to predict the correct strategy for the next response via the language modeling objective. In addition, this model can also be seen as PAL trained without incorporating persona.
MISC (Tu et al., 2022): the state-of-the-art (SOTA) on the ESConv benchmark, which leverages commonsense reasoning to better understand the seeker's emotions and implements a mixture of strategies to craft more supportive responses.
Hard Prompt: this model employs a straightforward idea when modeling seekers' persona information in the emotional support task, in which persona information is concatenated to the dialogue history.That is, the input to the model would be in the form "Persona: {persona} \n Dialogue history: {context} \n Response: ".

Implementation Details
We conducted our experiments on PESConv, using a 7:2:1 ratio to split the dataset into train, validation, and test sets. As Liu et al. (2021) stated, Blenderbot (Roller et al., 2021) outperforms DialoGPT (Zhang et al., 2020) on this task. Therefore, similar to previous work (Liu et al., 2021; Tu et al., 2022), we used the 90M version of Blenderbot (Roller et al., 2021). Moreover, we used the AdamW (Loshchilov and Hutter, 2018) optimizer with β1 = 0.9 and β2 = 0.999. We initialized the learning rate to 2.5e-5 and performed a 100-step linear warmup. The training and validation batch sizes were set to 4 and 16, respectively. The model was trained for 10 epochs, and we chose the checkpoint with the lowest loss on the validation set. During the decoding phase, we used both Top-k and Top-p sampling with k = 10 and p = 0.9, with the temperature and repetition penalty set to 0.5 and 1.03, respectively. The experiments were run on a single Quadro RTX 6000 GPU using the transformers library (Wolf et al., 2020).
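The decoding configuration above can be sketched as follows; the filtering order and the CTRL-style form of the repetition penalty follow common implementations and are our assumptions rather than details stated here:

```python
import numpy as np

def filtered_next_token_dist(logits, generated_ids, k=10, p=0.9,
                             temperature=0.5, repetition_penalty=1.03):
    """Turn raw next-token logits into a sampling distribution using a
    repetition penalty, temperature scaling, Top-k, and Top-p filtering."""
    logits = np.asarray(logits, dtype=float).copy()
    # Penalize tokens that were already generated.
    for tok in set(generated_ids):
        if logits[tok] > 0:
            logits[tok] /= repetition_penalty
        else:
            logits[tok] *= repetition_penalty
    logits = logits / temperature
    # Top-k: drop everything below the k-th largest logit.
    if k < len(logits):
        kth = np.sort(logits)[-k]
        logits[logits < kth] = -np.inf
    # Softmax, then Top-p: keep the smallest high-probability prefix
    # whose cumulative mass reaches p.
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    order = np.argsort(-probs)
    cum_before = np.cumsum(probs[order]) - probs[order]
    keep = order[cum_before < p]
    out = np.zeros_like(probs)
    out[keep] = probs[keep]
    return out / out.sum()

rng = np.random.default_rng(0)
dist = filtered_next_token_dist(rng.normal(size=50), generated_ids=[3, 7])
print(np.count_nonzero(dist))
```

The resulting distribution is what one would sample the next token from at each decoding step.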

Automatic Evaluation
We adopted strategy prediction accuracy (ACC), perplexity (PPL), BLEU-n (B-n) (Papineni et al., 2002), Distinct-n (D-n) (Li et al., 2016a), EAD-n (E-n) (Liu et al., 2022), Rouge-L (R-L) (Lin, 2004), and the mean cosine similarity between supporters' responses and personas under the SimCSE (Gao et al., 2021b) representation (cos-sim) to automatically evaluate our model's performance. Since the responses in this task are often long, we also leveraged the Expectancy-Adjusted Distinct (EAD) score to evaluate response diversity, as the Distinct score has been shown to be biased towards longer sentences (Liu et al., 2022). To calculate this score, rather than dividing the number of unique n-grams by the total number of n-grams, as in the original Distinct score, we use the model's vocabulary size as the denominator.
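The two diversity metrics can be sketched as follows; `vocab_size` is an illustrative placeholder for the model's actual vocabulary size, and the EAD variant implements the simplified denominator described above:

```python
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def distinct_n(responses, n):
    """Distinct-n: unique n-grams divided by total n-grams over all responses."""
    all_ngrams = [g for r in responses for g in ngrams(r.split(), n)]
    return len(set(all_ngrams)) / max(len(all_ngrams), 1)

def ead_n_simplified(responses, n, vocab_size):
    """Simplified EAD-n: unique n-grams divided by the model's vocabulary
    size instead of the total n-gram count, removing the length bias."""
    unique = {g for r in responses for g in ngrams(r.split(), n)}
    return len(unique) / vocab_size

responses = ["i am here to listen", "i am sorry to hear that"]
print(round(distinct_n(responses, 1), 3))
print(ead_n_simplified(responses, 1, vocab_size=50000))  # vocab size is illustrative
```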
As shown in Table 3, PAL outperforms all baselines on the automatic metrics, including the current SOTA model MISC. As Blenderbot-Joint can be perceived as PAL without persona employed in training, the significance of persona can be demonstrated by comparing the results of PAL and PAL (α = 0) with Blenderbot-Joint. In addition, compared to PAL (α = 0), PAL demonstrates a more balanced performance and has the best strategy prediction accuracy, diversity, and alignment with persona information, which indicates more seeker-specific responses. Interestingly, the cos-sim value for PAL is comparable to the mean value of the dialogues with an empathy score of 5 in Figure 3(a). Further comparing PAL and PAL (α = 0) shows that our strategy-based decoding approach significantly improves dialogue diversity, as reflected by D-n and E-n, which are more important metrics for dialogue systems than B-n and R-L (Liu et al., 2016; Gupta et al., 2019; Liu et al., 2022).
In Figure 4, the three models trained with persona information, PAL, PAL (α = 0), and Hard Prompt, all outperform MISC, demonstrating the importance of seekers' persona and highlighting the need for further research into how to better leverage such information in addition to commonsense reasoning.

Table 5: Responses from our approach and the baselines; due to space constraints, some sentences are omitted. (Ground-truth response: "You have got such a nice girlfriend, have a happy life with her.")

Human Evaluation
We acknowledge that automatic metrics are insufficient for empirically evaluating and highlighting the improvements of our proposed method. Hence, following Liu et al. (2021), we also conducted a human evaluation by recruiting crowd-sourcing workers to interact with the models. We provided workers with a scenario and asked them to act as seekers in that situation. Each worker interacted with two different models and scored them in terms of (1) Coherence, (2) Identification, (3) Comforting, (4) Suggestion, (5) Informativeness, and (6) Overall Preference. Detailed explanations of each aspect can be found in Appendix F. As shown in Table 4, we compare PAL with the other three models, and PAL beats or is competitive with the other methods on all of the above metrics. It performs well on the three metrics most closely aligned with persona (i.e., Comforting, Suggestion, and Informativeness), implying that persona is required for emotional support.

Case Study
In Table 5, we provide an example comparing the responses of our approach with the other methods. As can be seen, Blenderbot-Joint, MISC, and Hard Prompt all show little empathy, with responses that are very general and contain little information. PAL (α = 0), which does not use the strategy-based decoding method, is more specific but provides a less appropriate suggestion. Our model PAL shows strong empathy, is the most specific while providing appropriate suggestions, and incorporates persona information in the response ("feel ashamed" and "don't cheat on your girlfriend again"). Due to space constraints, more cases, including interaction cases and analysis over different strategies, can be found in Appendix G.

Conclusion
In this work, we introduced persona information into the emotional support task. We proposed a framework that dynamically captures seekers' persona information with our trained persona extractor and generates responses with a strategy-based controllable generation method. Through extensive experiments, we demonstrated that our proposed approach outperforms the studied baselines in both automatic and human evaluations. In addition, we provide persona annotations for the ESConv dataset generated by the persona extractor model, which we hope will foster research on personalized emotional support conversations.

Limitations
Persona extractor First, we need to clarify that our definition of persona is not exactly the psychological one, i.e., the role an individual plays in life (Jung, 2013). As a result, like previous studies (e.g., Persona-Chat (Zhang et al., 2018), PEC (Zhong et al., 2020)), the format of persona is flexible and variable. As stated in §4.1, there are still some issues with the model we use to infer persona information. For example, we sometimes obtain information that contradicts the facts, and there is occasionally unrelated content, as with commonsense reasoning (Tu et al., 2022). Furthermore, we cannot guarantee that we can infer all of the persona information that appears in the conversation, because much of it is frequently obscure. In addition, when extracting persona information, we only use what the user said and discard what the bot said, which results in the loss of some conversation information. The reason is that we found that if we use the entire conversation, the model frequently has difficulty distinguishing which persona information belongs to the user and which belongs to the other party. Finally, since the code of Xu et al. (2022) is not yet available, we have not compared against other methods of dynamically extracting persona from the conversation.
Strategy-based decoding During the decoding phase, we only tuned the α of each strategy at a coarse-grained level, as we found that coarse-grained tuning already produced good results. Future work may further explore the deeper relationship between different strategies and persona.

Ethical Considerations
In this work, we leveraged two publicly available datasets. First, we used the Persona-Chat dataset, which was collected by assigning a set of fixed predefined persona sentences to workers. Workers participating in the collection of this dataset were required not to disclose any personal information (Zhang et al., 2018), which prevents issues regarding the leakage of their privacy. Similarly, during the collection of the ESConv dataset, participants were asked to create imaginary situations and play the role of a support seeker in that situation. In addition, they were instructed not to provide personal information during their conversations with the trained supporters (Liu et al., 2021). Regarding the persona extractor, this module is trained to infer and extract persona information solely from what the user has mentioned in the conversation, rather than making assumptions about the user's background and character, further highlighting the importance of user privacy in our research.
Regarding our experiments, we ensured that all workers agreed to participate in the annotation tasks. Moreover, as the workers were recruited from the US, we ensured that they were paid above the minimum wage in that country for successfully completing our tasks. We acknowledge that using trained dialogue models to provide support is a sensitive subject, and research on this topic should be conducted with sufficient precautions and supervision. We also acknowledge that, at their current stage, such models cannot replace human supporters for this task (Sabour et al., 2022a). Thus, they should not be employed to replace professional counselors or intervention, nor to interact with users suffering from mental distress, such as depression or suicidal thoughts.

A Persona Extractor
In our initial experiments, we compared the effectiveness of various generative models for inferring persona (such as GPT2 (Radford et al., 2019), DialoGPT (Zhang et al., 2020), and BART (Lewis et al., 2020)). We manually checked a sample of the results and found that the best results were obtained by the BART model fine-tuned on CNN Daily Mail (Hermann et al., 2015). We trained this model for ten epochs with a batch size of 4 and a learning rate of 1e-5, and selected the best-performing checkpoint.

B Relevance of Individualization and Seeker Evaluation
Here we show the results produced by fastText in Figure 5.

C Helping Strategies in ESConv
A total of 8 strategies are marked in ESConv, and they are basically evenly distributed (Liu et al., 2021). Here we list these strategies and their detailed definitions, which are directly adopted from Liu et al. (2021).
Question Asking for information related to the problem to help the help-seeker articulate the issues that they face.Open-ended questions are best, and closed questions can be used to get specific information.
Restatement or Paraphrasing A simple, more concise rephrasing of the help-seekers' statements could help them see their situation more clearly.
Reflection of Feelings Articulate and describe the help-seekers' feelings.
Self-disclosure Divulge similar experiences that you have had or emotions that you share with the help-seeker to express your empathy.
Affirmation and Reassurance Affirm the helpseeker's strengths, motivation, and capabilities and provide reassurance and encouragement.
Providing Suggestions Provide suggestions about how to change but be careful not to overstep and tell them what to do.
Information Provide useful information to the help-seeker, for example, with data, facts, opinions, resources, or by answering questions.
Others Exchange pleasantries and use other support strategies that do not fall into the above categories.

D Tuning Process of the α Values
We first tried to set these α values as trainable parameters, but we found that the values changed very little during training and therefore depended heavily on their initialization, so we set the α values as hyperparameters.
These values were then obtained through numerous attempts on the validation set, chosen so that the model achieved balanced performance on the automatic evaluation. We acknowledge that this tuning process is simple and coarse-grained. We leave approaches to improve this process, such as using a simulated annealing algorithm, to future work.

E Analysis of α Selected for Different Strategies
In §3.4, we analyzed the strategies Question and Providing Suggestions. The remaining strategies are analyzed below.
For the Restatement or Paraphrasing strategy, the supporter needs to repeat the seeker's words, and a more specific restatement can help seekers better understand themselves, so we set α high. For the Reflection of Feelings strategy, the focus is on feelings, whereas the extracted persona information is more fact-related, so we set α low. For the Self-disclosure strategy, the response centers on the supporter's own experience and should not focus too heavily on the seeker's persona information, which may introduce unnecessary errors, so we also set α low. For the Affirmation and Reassurance strategy, combining the seeker's persona information can often provide more specific encouragement and a better experience for the seeker, so we set α high. For the Information strategy, we need to consider more persona information in order to provide more appropriate and specific information for seekers, so we set α high. The Others strategy appears mainly in greetings and thanks, where incorporating more seeker characteristics might make seekers feel more relaxed, so we initially set α high; however, closer observation showed that Others is also used when none of the other strategies is appropriate. Although such cases are rare, to avoid unnecessary errors we set α to medium.
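The per-strategy levels above plug into the paper's controllable decoding rule: the next-token distribution computed with the persona is reweighted by the ratio of the with-persona to without-persona probabilities, raised to the strategy's α. A minimal NumPy sketch of one decoding step (the α values and toy distributions here are illustrative; the tuned values are those in Table 2):

```python
import numpy as np

# Illustrative alpha levels per strategy (low / medium / high);
# the actual tuned values are given in Table 2.
ALPHA = {
    "Question": 0.5,
    "Restatement or Paraphrasing": 1.0,
    "Reflection of Feelings": 0.1,
    "Self-disclosure": 0.1,
    "Affirmation and Reassurance": 1.0,
    "Providing Suggestions": 1.0,
    "Information": 1.0,
    "Others": 0.5,
}

def reweight(p_with_persona, p_without_persona, strategy):
    """Boost tokens whose probability rises when the persona is conditioned on:
    scores ∝ P(r_t | d, p) * (P(r_t | d, p) / P(r_t | d)) ** alpha."""
    alpha = ALPHA[strategy]
    scores = p_with_persona * (p_with_persona / p_without_persona) ** alpha
    return scores / scores.sum()  # renormalize into a distribution

# Toy 4-token vocabulary: conditioning on the persona raises token 2's mass.
p_full = np.array([0.25, 0.25, 0.40, 0.10])   # P(r_t | d, p)
p_plain = np.array([0.25, 0.25, 0.25, 0.25])  # P(r_t | d)
p_new = reweight(p_full, p_plain, "Providing Suggestions")
```

A high α (as for Providing Suggestions) sharpens the distribution toward persona-supported tokens, while a low α (as for Reflection of Feelings) leaves the base distribution nearly unchanged.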

F Human Evaluation
Here we show the guidelines for the two human evaluation experiments in Figure 6 and Figure 7. For the persona-extractor manual evaluation experiment, we pay $0.05 per piece of data, and for the human interactive evaluation, we pay $0.10 per piece of data, with the price adjusted for the average time it takes workers to complete the task. We stated in the task description that this is an evaluation task, so the data submitted by workers is used only for evaluation.

G Case Study
Due to space limitations in the main text, we show more examples here; note that these are cherry-picked.
In Figure 8, we show an interactive case. It can be seen that PAL uses the extracted persona appropriately several times in the conversation and gives the seeker specific advice.
In Figure 9, we show some cases from the ESConv dataset. Interestingly, in these examples PAL sometimes performs better than the ground truth, giving a more appropriate and specific response rather than a general one, which further demonstrates the superiority of our model.
Here, we also compare our model with the baselines over different strategies. In Table 6, we show a case of the strategy Providing Suggestions; our model provides the most specific suggestions. In Table 7, we show a case of the strategy Affirmation and Reassurance; PAL's response is again the most specific.

Excerpt from Table 7:

Baseline: I can see how that can be a problem. That is a very difficult situation to be in.
PAL: That's a difficult situation to be in. It sounds like you are being betrayed. I believe you deserve someone better.
Ground-truth: I can imagine how a break in trust has made you feel. I know a break in trust is always very difficult to process.

Figure 1: An example from the ESConv dataset, in which the trained supporter extracts key information about the seeker's persona and leverages this information to provide effective emotional support.

Figure 4: The top-n strategy prediction accuracy.

Figure 7: A screenshot of the human interactive evaluation guideline.

Figure 8: An interactive example. In this case, the orange dialog boxes are the supporter (PAL) and the green dialog boxes are the seeker. Due to the large number of dialogue rounds, the conversation is split into left and right columns, with the left column coming first. The persona used by PAL is given below the dialog boxes.

Figure 9: Some examples from the ESConv dataset; we do not show all rounds of the dialogues due to space constraints. The orange dialogues are from the supporter (PAL).

Table 1: An example conversation from PESConv. This conversation contains 5 utterances, where "-" indicates that no persona information was found. Once detected, new inferences are added to the seeker's persona.

Table 2: The values and levels of α corresponding to different strategies.

$$\tilde{P}(r_t \mid r_{<t}, d, p) \propto P(r_t \mid r_{<t}, d, p) \cdot \left[\frac{P(r_t \mid r_{<t}, d, p)}{P(r_t \mid r_{<t}, d)}\right]^{\alpha}$$

where $d$ and $p$ denote the dialogue history and persona, respectively. Both $P(r_t \mid r_{<t}, d, p)$ and $P(r_t \mid r_{<t}, d)$ are calculated by our model; the only difference is that the persona is not included in calculating $P(r_t \mid r_{<t}, d)$. The last term in this equation can be interpreted as the ratio of the probability of generating $r_t$ with the persona to that without it.

Table 3: The results of automatic metric evaluation for each model on ESConv. PAL (α = 0) represents setting the α of each strategy to 0, thus neglecting our proposed controllable generation decoding method.

In Figure 4, we show the accuracy of the top-n strategy prediction, and our model PAL achieves the best results.

Excerpt from Table 6:

Baseline: I understand it is hard, so now you have to forgive her. (Less Proper Suggestion)
PAL: I understand how that can be hard. I would suggest you talk to her mother, tell her that you feel ashamed about it, and don't cheat on your girlfriend again. (Strong Empathy)

Table 7: Responses from our approach and others.