XPersona: Evaluating Multilingual Personalized Chatbot

Personalized dialogue systems are an essential step toward better human-machine interaction. Existing personalized dialogue agents rely on properly designed conversational datasets, which are mostly monolingual (e.g., English) and thus greatly limit the usage of conversational agents in other languages. In this paper, we propose a multilingual extension of Persona-Chat, namely XPersona. Our dataset includes persona conversations in six languages other than English for evaluating multilingual personalized agents. We experiment with both multilingual and cross-lingual trained baselines and evaluate them against monolingual and translation-pipeline models using both automatic and human evaluation. Experimental results show that the multilingual trained models outperform the translation pipeline and that they are on par with the monolingual models, with the advantage of having a single model across multiple languages. On the other hand, the state-of-the-art cross-lingual trained models achieve inferior performance to the other models, showing that cross-lingual conversation modeling is a challenging task. We hope that our dataset and baselines will accelerate research in multilingual dialogue systems.


Introduction
Personalized dialogue agents have been shown to be effective in conducting human-like conversation. This progress has been catalyzed by existing conversational datasets such as Persona-chat (Zhang et al., 2018; Dinan et al., 2019a). However, the training data are provided in a single language (e.g., English), and thus the resulting systems can perform conversations only in the training language. At the same time, commercial dialogue systems are required to handle a large number of languages, since the smart home devices market is increasingly international (Etherington, 2019). Therefore, creating multilingual conversational benchmarks is essential, yet challenging, since it is costly to perform human annotation of data in all languages.
A possible solution is to use translation systems before and after the model inference: a two-step translation from any language to English and from English back to any language. This comes with three major problems: 1) amplification of translation errors, since current dialogue systems are far from perfect, especially with noisy input; 2) the three-stage pipeline is significantly slower in terms of inference speed; and 3) high translation costs, since the current state-of-the-art translation models, especially for low-resource languages, are only available through costly APIs.
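The three-stage flow above can be sketched as follows. Note that `translate` and `english_chatbot` are hypothetical stand-ins for a commercial translation API and an English-only dialogue model; the point is only to make the pipeline's shape, and its error-amplification path, concrete.

```python
def translate(text, src, tgt):
    # Placeholder: a real system would call a paid translation API here.
    return f"[{src}->{tgt}] {text}"

def english_chatbot(query):
    # Placeholder English-only response model.
    return f"reply to: {query}"

def pipeline_respond(query, lang):
    """Three-stage flow: target query -> English query -> English
    response -> target response. Each stage adds latency, and any
    translation error in stage 1 is fed into the dialogue model and
    then translated again in stage 3."""
    en_query = translate(query, lang, "en")
    en_reply = english_chatbot(en_query)
    return translate(en_reply, "en", lang)

print(pipeline_respond("Ciao, come stai?", "it"))
```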
In this paper, we analyze two possible workarounds to alleviate the aforementioned challenges. The first is to build a cross-lingual transferable system by aligning cross-lingual representations, as in Conneau et al. (2018), in which the system is trained on one language and transferred zero-shot to another language. The second is to learn a multilingual system directly from noisy multilingual data (e.g., translated data), thus removing the dependence on translation systems at inference time.
To evaluate the aforementioned systems, we propose a dataset called Multilingual Persona-Chat, or XPersona, by extending the Persona-Chat corpora (Dinan et al., 2019a) to six languages: Chinese, French, Indonesian, Italian, Korean, and Japanese. In XPersona, the training sets are automatically translated using translation APIs with several human-in-the-loop passes of mistake correction. In contrast, the validation and test sets are annotated by human experts to facilitate both automatic and human evaluations in multiple languages.

Table 1: Multi-turn annotated dialogue samples from the test set in seven languages. For simplicity, we only show three turns for each dialogue, and the persona in English.
Furthermore, we propose competitive baselines in two training settings, namely cross-lingual and multilingual, and compare them with translation-pipeline models. Our baselines leverage pre-trained cross-lingual (Chi et al., 2019) and multilingual (Devlin et al., 2018) models.
An extensive automatic and human evaluation (Li et al., 2019) of our models shows that a multilingual system is able to outperform strong translation-based models and is on par with or even improves on the monolingual models. The cross-lingual performance is still lower than that of the other models, which indicates that cross-lingual conversation modeling is very challenging. The main contributions of this paper are summarized as follows:

• We present the first multilingual non-goal-oriented dialogue benchmark for evaluating multilingual generative chatbots.
• We provide both cross-lingual and multilingual baselines and discuss their limitations to inspire future research.
• We show the potential of multilingual systems to understand the mixed language dialogue context and generate coherent responses.

Related Work
Dialogue Systems are categorized as goal-oriented (Williams and Young, 2007; Young et al., 2013) and chit-chat (Serban et al., 2016; Vinyals and Le, 2015). Interested readers may refer to Gao et al. (2018) for a general overview. In this paper, we focus on the latter, for which, in recent years, several tasks and datasets have been proposed to ground the conversation on knowledge (Dinan et al., 2019b; Gopalakrishnan et al., 2019; Shuster et al., 2018; Fan et al., 2019; Reddy et al., 2019; Choi et al., 2018; Moon et al., 2019) such as Wiki-Articles, Reddit-Posts, and CNN-Articles. In this work, we focus on personalized dialogue agents, where the dialogues are grounded on persona information. Li et al. (2016a) were the first to introduce a persona-grounded dialogue dataset for improving response consistency. Later on, Zhang et al. (2018) and Dinan et al. (2019a) introduced Persona-chat, a multi-turn conversational dataset, where two speakers are paired and a persona description (4-5 sentences) is randomly assigned to each of them. By conditioning the response generation on the persona descriptions, a chit-chat model is able to produce a more persona-consistent dialogue (Zhang et al., 2018). Several works have improved on the initial baselines with various methodologies (Kulikov et al., 2018; Yavuz et al., 2019; Hancock et al., 2019; Madotto et al., 2019; Joshi et al., 2017; Zemlyanskiy and Sha, 2018), especially using large pre-trained models (Wolf et al., 2019; Zhang et al., 2019).
Cross-lingual Cross-lingual adaptation learns the inter-connections among languages and circumvents the requirement of extensive training data in target languages (Wisniewski et al., 2014; Zhang et al., 2016; Liu et al., 2019b). Cross-lingual transfer learning methods have been applied to multiple NLP tasks, such as named entity recognition (Ni et al., 2017; Xie et al., 2018), natural language understanding (Liu et al., 2019c), dialogue state tracking (Chen et al., 2018), part-of-speech tagging (Wisniewski et al., 2014; Zhang et al., 2016; Kim et al., 2017), and dependency parsing (Ahmad et al., 2019; Schuster et al., 2019b). Meanwhile, Lample and Conneau (2019) and Conneau et al. (2019) proposed pre-trained cross-lingual language models to align multiple language representations, achieving state-of-the-art results in many cross-lingual classification tasks. The aforementioned tasks focused on classification and sequence labeling; instead, Chi et al. (2019) proposed to pre-train both the encoder and decoder of a sequence-to-sequence model (XNLG) to conduct cross-lingual generation tasks, namely question generation and abstractive summarization. The latter is the closest to our task since it focuses on language generation; however, cross-lingual dialogue generation has not yet been explored.

Data Collection
The proposed XPersona dataset is an extension of the persona-chat dataset (Zhang et al., 2018; Dinan et al., 2019a). Specifically, we extend the ConvAI2 dataset (Dinan et al., 2019a) to six languages. Compared to collecting new persona sentences and dialogues in each language, human-annotating the dialogues by leveraging translation APIs has multiple advantages. First, it increases the data distribution similarity across languages (Conneau et al., 2018), which can better examine the system's cross-lingual transferability. Second, revising the machine-translated dialogues based on the original English dialogue improves the data construction efficiency. Third, it leverages the well-constructed English persona conversations as a reference to ensure the dialogue quality without the need for training a new pool of workers to generate new samples (Conneau et al., 2018).
On the other hand, human-translating the entire training set (∼130K utterances) in six languages is expensive. Therefore, we propose an iterative method to improve the quality of the automatically translated training set. We first sample 200 dialogues from the training set (∼2600 utterances) in each language, and we assign human annotators to list all frequent translation mistakes in the given dialogues. For example, daily colloquial English expressions such as "cool", "I see", and "lol" are usually translated literally. After that, we use simple string matching to revise the inappropriate translations in the whole training set and return a revision log, which records all the revised utterances. Then, we assign human annotators to check all the revised utterances and list translation mistakes again. We repeat this process at least twice for each language. Finally, we summarize the statistics of the collected dataset in Table 2.
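The string-matching revision step described above can be sketched as follows. The function name and the example correction table are our own illustration, not the actual annotation data.

```python
def revise_translations(utterances, corrections):
    """Apply string-match fixes across the whole training set and return
    a revision log of (index, before, after) entries, so that annotators
    can re-check exactly the utterances that changed."""
    log = []
    revised = []
    for i, utt in enumerate(utterances):
        fixed = utt
        for bad, good in corrections.items():
            fixed = fixed.replace(bad, good)
        if fixed != utt:
            log.append((i, utt, fixed))
        revised.append(fixed)
    return revised, log

# One iteration of the loop: annotators list frequent literal
# translations, the fixes are applied corpus-wide, then the revision
# log is checked and the cycle repeats (at least twice per language).
corrections = {"frais": "cool"}  # hypothetical literal-translation fix
revised, log = revise_translations(["c'est frais !", "bonjour"], corrections)
print(len(log))  # only the utterances that changed are logged
```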

Multilingual Personalized Conversational Models
Let us define a dialogue D = {U_1, S_1, U_2, S_2, ..., U_n, S_n} as an alternating sequence of utterances from two speakers, where U and S represent the user and the system, respectively. Each speaker has a corresponding persona description that consists of a set of sentences P = {P_1, ..., P_m}. Given the system persona sentences P_s and the dialogue history D_t = {U_1, S_1, U_2, ..., S_{t-1}, U_t}, we are interested in predicting the system utterance S_t.
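The notation above maps onto a concrete data structure as follows (the field names are our own illustration, not the dataset's release format):

```python
# One persona-grounded dialogue: a persona P = {P_1, ..., P_m} and an
# alternating sequence of user (U) and system (S) utterances.
dialogue = {
    "persona": ["I like hiking.", "I have two dogs."],  # P_1 ... P_m
    "turns": [                                          # U_1, S_1, U_2, ...
        ("user", "Hi! What do you do for fun?"),
        ("system", "I love hiking with my dogs."),
        ("user", "Nice, where do you hike?"),
    ],
}

# The model conditions on the system persona P_s and the history
# D_t = {U_1, S_1, ..., U_t} to predict the next system utterance S_t.
history = [text for _, text in dialogue["turns"]]
print(len(history))  # number of context utterances
```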

Model Architecture
We explore both encoder-decoder and causal decoder architectures, and we leverage existing pre-trained contextualized multilingual language models as weight initialization. Hence, we first define the multilingual embedding layer and then the two multilingual models used in our experiments.
Embedding We define three embedding matrices: the word embedding E_W ∈ R^{|V|×d}, the positional embedding E_P ∈ R^{M×d}, and the segmentation embedding E_S ∈ R^{|S|×d}, where |·| denotes set cardinality, d is the embedding size, V denotes the vocabulary, M denotes the maximum sequence length, and S denotes the set of segmentation tokens. The segmentation embedding (Wolf et al., 2019) is used to indicate whether the current token is part of i) the persona sentences, ii) system (Sys.) utterances, iii) user utterances, or iv) the response in language l_id. The language embedding l_id is used to inform the model which language to generate. Hence, given a sequence of tokens X, the embedding function E is defined as:

E(X) = E_W(X) ⊕ E_P(X_pos) ⊕ E_S(X_seg),

where ⊕ denotes the positional sum, X_pos = {1, ..., |X|}, and X_seg is the sequence of segmentation tokens, as in Wolf et al. (2019). Figure 1 shows a visual representation of the embedding process. A more detailed illustration is reported in Appendix B.
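A minimal sketch of the position-wise embedding sum, using toy 2-dimensional embeddings (real models use d in the hundreds; the table values here are arbitrary):

```python
# Toy lookup tables for E_W (word), E_P (positional), E_S (segment).
E_W = {"hello": [0.1, 0.2], "world": [0.3, 0.4]}
E_P = [[0.01, 0.01], [0.02, 0.02]]
E_S = {"sys": [1.0, 0.0], "lang_it": [0.0, 1.0]}

def embed(tokens, segments):
    """E(X) = E_W(X) + E_P(X_pos) + E_S(X_seg), summed position-wise.
    On the decoder side the segment embedding is the language ID,
    which tells the model which language to generate."""
    out = []
    for pos, (tok, seg) in enumerate(zip(tokens, segments)):
        out.append([w + p + s
                    for w, p, s in zip(E_W[tok], E_P[pos], E_S[seg])])
    return out

emb = embed(["hello", "world"], ["sys", "sys"])
print(emb[0])
```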
Encoder-Decoder To model the response generation, we use a Transformer-based (Vaswani et al., 2017) encoder-decoder (Vinyals and Le, 2015). As illustrated in Figure 1, we concatenate the system persona P_s with the dialogue history D_t, apply the embedding layer E, and pass the result to the encoder. In short, we have:

H = Encoder(E([P_s; D_t])),

where H ∈ R^{L×d_model} is the hidden representation computed by the encoder, and L denotes the input sequence length. Then, the decoder attends to H and generates the system response S_t token by token. In the decoder, the segmentation embedding is the language ID embedding (e.g., we look up the embedding for Italian to decode Italian). Thus:

S_t = Decoder(E(S_{<t}), H).

Causal Decoder As an alternative to encoder-decoders, causal decoders (Radford et al., 2018, 2019; He et al., 2018) have been used to model conversational responses (Wolf et al., 2019; Zhang et al., 2019) by providing the dialogue history as a prefix. In our model, we concatenate the persona P_s and the dialogue history D_t as the language-model prefix, and autoregressively decode the system response S_t based on the language embedding (i.e., l_id):

S_t = Decoder(E([P_s; D_t; S_{<t}])).

Figure 1 shows the conceptual differences between the encoder-decoder and the causal decoder. Note that in both multilingual models, the dialogue history encoding process is language-agnostic, while the decoding language is controlled by the language embedding. This design allows the model to understand mixed-language dialogue contexts and to respond in the desired language (details in Section 5.3.2).
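The causal-decoder input layout can be sketched as follows: the persona and dialogue history form the prefix, each token is tagged with its segment, and the response segment carries the language ID that selects the output language. The token strings and segment labels are illustrative placeholders.

```python
def build_prefix(persona, history, lang_id):
    """Concatenate persona sentences and dialogue history into the
    language-model prefix, tagging each token with its segment; every
    response token decoded after the prefix uses the language segment."""
    tokens, segments = [], []
    for sent in persona:
        for tok in sent.split():
            tokens.append(tok)
            segments.append("persona")
    for speaker, utt in history:
        for tok in utt.split():
            tokens.append(tok)
            segments.append(speaker)
    # Decoding starts after the prefix with this segment/language ID.
    return tokens, segments, f"lang_{lang_id}"

tokens, segments, response_seg = build_prefix(
    ["i like hiking ."],
    [("user", "hi !"), ("system", "hello .")],
    "it",  # decode the response in Italian
)
print(response_seg)
```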

Training Strategy
We consider two training strategies to learn a multilingual conversational model: multilingual training and cross-lingual training.
Multilingual Training jointly learns to perform personalized conversations in multiple languages. We follow a transfer-learning approach (Wolf et al., 2019; See et al., 2019) by initializing our models with the weights of the large multilingual pre-trained model M-Bert (Pires et al., 2019). For the causal decoder, we add a causal mask to the self-attention layers to convert the M-Bert encoder into a decoder. For the encoder-decoder model, we randomly initialize the encoder-decoder cross-attention (Rothe et al., 2019). Then, we train both models on the combined training set of all 7 languages using a cross-entropy loss.
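The causal mask that converts a bidirectional encoder into a decoder is simply a lower-triangular attention mask, so that position i can only attend to positions ≤ i:

```python
def causal_mask(n):
    """Lower-triangular attention mask for a sequence of length n
    (1 = may attend, 0 = blocked). Applied inside self-attention, it
    prevents each position from seeing future tokens."""
    return [[1 if j <= i else 0 for j in range(n)] for i in range(n)]

for row in causal_mask(4):
    print(row)
```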
Cross-lingual Training transfers knowledge from the source-language data to the target languages. In this setting, the model is trained on English (source language) conversational samples and evaluated on the other 6 languages. Following the methodology proposed by Chi et al. (2019), we align the embedded representations of different languages into the same embedding space by applying cross-lingual pre-training to the encoder-decoder model. The pre-training procedure consists of two stages:

• pre-training the encoder and the decoder independently using masked language modeling, as in Lample and Conneau (2019);

• jointly pre-training the encoder-decoder using two objective functions: Cross-Lingual Auto-Encoding (XAE) and Denoising Auto-Encoding (DAE) (Chi et al., 2019).
For instance, DAE adds perturbations to the encoder's input sentence and tries to reconstruct the original sentence with the decoder, whereas XAE uses parallel translation data to pre-train both the encoder and the decoder with a machine-translation objective. As in the multilingual models, the language IDs are fed into the decoder to control the language of the generated sentences. Both pre-training stages require parallel as well as non-parallel data in the target language.
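A toy version of the DAE input corruption is shown below: the encoder input is perturbed (here by dropping and locally shuffling tokens) and the decoder is trained to reconstruct the original sentence. The exact noise functions used by Chi et al. (2019) differ; this sketch only conveys the shape of the objective.

```python
import random

def add_noise(tokens, drop_prob=0.1, shuffle_window=3, seed=0):
    """Corrupt the encoder input: randomly drop tokens, then apply a
    local shuffle in which each token moves by less than
    `shuffle_window` positions."""
    rng = random.Random(seed)
    kept = [t for t in tokens if rng.random() > drop_prob]
    keys = [i + rng.uniform(0, shuffle_window) for i in range(len(kept))]
    return [t for _, t in sorted(zip(keys, kept), key=lambda p: p[0])]

source = "the quick brown fox jumps".split()
noisy = add_noise(source)
# DAE training pair: the encoder sees `noisy`, the decoder must
# reconstruct `source`.
print(noisy)
```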
After the two pre-training stages, the model is fine-tuned using just the source-language samples (i.e., English) with the same cross-entropy loss as in the multilingual training. However, as suggested in Chi et al. (2019), only the encoder parameters are updated with back-propagation, while both the decoder and the word embedding layer remain frozen. This retains the decoder's ability to generate multilingual output while still being able to learn new tasks using only the source language.

Evaluation Metrics
Evaluating open-domain chit-chat models is challenging, especially in multiple languages and at the dialogue level. Hence, we evaluate our models using both automatic and human evaluation. In both cases, human-annotated dialogues are used, which shows the importance of the provided dataset.
Automatic For each language, we evaluate the responses generated by the models using perplexity (ppl.) and BLEU (Papineni et al., 2002) with reference to the human-annotated responses. Although these automatic measures are not perfect (Liu et al., 2016), they help to roughly estimate the performance of different models on the same test set. More recently, Adiwardana et al. (2020) have shown a correlation between perplexity and human judgment for open-domain chit-chat models.
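Perplexity here is the exponential of the mean per-token negative log-likelihood the model assigns to the reference responses (natural log below; the base is a convention):

```python
import math

def perplexity(token_log_probs):
    """token_log_probs: the model's log-probabilities of each
    reference token. Lower perplexity = the model finds the
    human-annotated responses less surprising."""
    nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(nll)

# A model that assigns probability 0.25 to every reference token:
print(round(perplexity([math.log(0.25)] * 10), 2))  # → 4.0
```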
Human Asking humans to evaluate the quality of a dialogue model is challenging, especially when multiple models have to be compared. The Likert score (a.k.a. 1-to-5 scoring) has been widely used to evaluate the interactive experience with conversational models (Venkatesh et al., 2018; See et al., 2019; Zhang et al., 2018; Dinan et al., 2019a). In such an evaluation, a human interacts with the systems for several turns and then assigns a score from 1 to 5 based on three questions (Zhang et al., 2018) about fluency, engagingness, and consistency. This evaluation is expensive to conduct and requires many samples to achieve statistically significant results (Li et al., 2019). To cope with these issues, Li et al. (2019) proposed ACUTE-EVAL, an A/B-test evaluation for dialogue systems. The authors proposed two modes: human-model chats and self-chat (Li et al., 2016b; Ghandeharioun et al., 2019). In this work, we opt for the latter since it is cheaper to conduct and achieves results similar (Li et al., 2019) to the former. Another advantage of this method is the ability to evaluate multi-turn conversations instead of single-turn responses.
Following ACUTE-EVAL, the annotator is provided with two full dialogues made by self-chat or human dialogue. The annotator is asked to choose which of the two dialogues is better in terms of engagingness, interestingness, and humanness. For each comparison, we sample 60-100 conversations from both models. In Appendix C, we report the exact questions and instructions given to the annotators, and the user interface used in the evaluation. We hired native-speaker annotators for all six considered languages. The annotators were different from the dataset collection annotators to avoid any possible bias.
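With pairwise A/B comparisons like these, the winning rate can be tested for significance against a 50% null with an exact two-sided binomial test. A minimal sketch, with made-up counts (not the paper's results):

```python
from math import comb

def binom_two_sided_p(wins, n):
    """Exact two-sided binomial test with null p0 = 0.5: sum the
    probabilities of all outcomes at least as extreme (as unlikely)
    as the observed win count."""
    observed = comb(n, wins) * 0.5 ** n
    return sum(comb(n, k) * 0.5 ** n
               for k in range(n + 1)
               if comb(n, k) * 0.5 ** n <= observed + 1e-12)

wins, n = 70, 100          # hypothetical: model A preferred 70/100 times
p = binom_two_sided_p(wins, n)
print(p < 0.05)            # significant at the 5% level → True
```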

Implementation Details
Multilingual Models We use the "BERT-Base, Multilingual Cased" checkpoint, and we denote the multilingual encoder-decoder model as M-Bert2Bert (∼220M parameters) and the causal decoder model as M-CausalBert (∼110M parameters). We fine-tune both models on the combined training set (English from Persona-chat (Zhang et al., 2018), plus the six languages of XPersona) for five epochs with the AdamW optimizer and a learning rate of 6.25e-5.
Monolingual Models To verify whether the multilingual agent under-performs a monolingual agent on the monolingual conversational task, we build a monolingual encoder-decoder model and a causal decoder model for each language. For a fair comparison, we initialize the monolingual models with a pre-trained monolingual BERT (Devlin et al., 2018; Cui et al., 2019; Martin et al., 2019). We then fine-tune each model on each language independently with the same number of epochs and the same optimizer as the multilingual models.
Translation-based Models Another strong baseline we compare with is Poly-encoder (Humeau et al., 2019), a large-scale pre-trained retrieval model that has shown state-of-the-art performance in the English Persona-chat dataset (Li et al., 2019).
We adapt this model to the other languages by using the Google Translate API to translate the target-language (e.g., Chinese) query into English as input to the model, and then translate the English response back into the target language. Thus, the response generation flow is: target query → English query → English response → target response. We denote this model as Poly.
Cross-lingual Models In the first pre-training stage, we use the pre-trained weights from XLM-R base (Conneau et al., 2019). Then, we follow the second pre-training stage of XNLG (Chi et al., 2019) to pre-train the Italian, Japanese, Korean, and Indonesian cross-lingual transferable models. For Chinese and French, we directly apply the pre-trained XNLG (Chi et al., 2019) weights. The pre-trained models are then fine-tuned on the English PersonaChat training set, with early stopping based on the perplexity on the target-language validation set.

We hypothesize that this is because the one-to-many problem (Zhao et al., 2017) in open-domain conversation weakens the relation between the encoder and the decoder; thus, the well pre-trained decoder (Bert) easily converges to a local optimum and learns to ignore the dialogue context from the encoder, generating the response as an unconditional language model. We leave the investigation of this problem to future work. On the other hand, M-CausalBert achieves comparable or slightly better performance than CausalBert, which suggests that M-CausalBert leverages the data from the other languages. As expected, we observe a significant gap between the cross-lingual model and the other models, which indicates that cross-lingual zero-shot conversation modeling is very challenging.

Quantitative Analysis
Table 4 shows the human evaluation results comparing M-CausalBert (Multi) against humans, the translation-based Poly-encoder (Poly), and the monolingual CausalBert (Mono). The results illustrate that Multi outperforms Mono in English and Chinese, and is on par with Mono in the other languages. On the other hand, Poly shows strong performance in English, as it was pre-trained on a large-scale English conversation corpus. In contrast, the performance of Poly drops in the other languages, which indicates that imperfect translation affects translation-based systems. We also conduct a human evaluation of M-CausalBert (Multi) against XNLG (Cross), and Multi achieves a nearly 100 percent winning rate.

Qualitative Analysis and Discussion
We randomly sample 7 self-chat dialogues for each baseline model in the seven languages and report them in Appendix D. We summarize the generations of each model as follows:

Poly The Poly-encoder, pre-trained on 174 million Reddit examples, can accurately retrieve coherent and diverse responses in English. However, in the other six languages, some of the retrieved responses are digressive due to translation errors.

Monolingual & Multilingual
We observe that both the monolingual and multilingual models can generate fluent responses. Compared to Bert2Bert and M-Bert2Bert, CausalBert and M-CausalBert generate more on-topic responses, but sometimes repeat themselves across turns. CausalBert and M-CausalBert are on par with each other in monolingual conversational tasks, while M-CausalBert shows the advantage of handling a mixed-language context. For multilingual speakers, a conversation may involve multiple languages. Therefore, we experiment on M-CausalBert with two settings: 1) many-to-one, in which users converse with the model in 6 languages and the model generates responses in English, and 2) one-to-many, in which users converse with the model in English and the model generates responses in 6 languages using the language embedding and the corresponding persona sentences. Table 5 and Table 6 illustrate generation examples under these settings (more examples are reported in Appendix C.1). Most of the time, M-CausalBert can understand the mixed-language context and decode coherent responses in different languages. Understanding mixed-language dialogue contexts is a desirable skill for end-to-end chit-chat systems, and a systematic study of this research question is needed in the future.

Table 5: Many-to-one: understanding a mixed-language dialogue context in multiple languages and generating responses in one language.
Cross-lingual The current state-of-the-art cross-lingual generation approach, XNLG (Chi et al., 2019), shows inferior performance on multi-turn dialogue tasks and generates repetitive responses. Although cross-lingual dialogue generation is challenging, it reduces the human effort needed for data annotation in different languages. Therefore, cross-lingual transfer is an important direction to investigate.
Table 6: One-to-many: responding to one dialogue context in 7 different languages. (Sample context: "Hello, I am a teacher. What are you studying?")

Conclusion
In this paper, we studied both cross-lingual and multilingual approaches in end-to-end personalized dialogue modeling. We presented the XPersona dataset, a multilingual extension of Persona-Chat, for evaluating multilingual personalized chatbots. We further provided both cross-lingual and multilingual baselines and compared them with the monolingual approach and the two-stage translation approach. Extensive automatic and human evaluations were conducted to examine the models' performance. The experimental results showed that multilingual trained models, with a single model across multiple languages, can outperform the two-stage translation approach and are on par with monolingual models. On the other hand, the current state-of-the-art cross-lingual approach, XNLG, achieved lower performance than the other baselines. In future work, we plan to research more advanced cross-lingual generation approaches and construct a mixed-language conversational benchmark for evaluating multilingual systems.

A.1 Annotation Instructions
In this section, we show the instructions for the French annotation: • There are two existing columns of conversations: the first column (en) contains the original conversations in English; the second column (fr) contains the conversations translated by an automatic system (e.g., Google Translate).
• You should copy the conversation from the second column (the translated conversations) into the third column (named fr_annotation).
In that column, you should then revise the incorrect or inappropriate translations.
• The goal of the revision is to make the conversations more coherent and fluent in the target language (French). Hence, you can customize dialogues and persona sentences to make them fluent and coherent in the target language, including by deviating from the original translation. However, you should retain persona and conversation consistency.

A.2 Training Set Statistics
We report our iterative revised training set statistics in Table 7.

B Model Detail
Figures 3 and 4 illustrate the details of the multilingual causal decoder and the multilingual encoder-decoder models.

C Human Evaluation
As illustrated in Figure 2, the annotator is provided with two full dialogues made by a self-chat model or human dialogues. Then the annotators are asked the following questions: • Who would you talk to for a long conversation?
• If you had to say one of these speakers is interesting and one is boring, who would you say is more interesting?
• Which speaker sounds more human?

D Generated Samples D.1 Mixed-language Samples
We report more mixed-language samples generated by M-CausalBert in Tables 8 and 9.

D.2 Model Comparison Samples
We randomly sample one self-chat dialogue example for each model in each language and report them in Figures 5-37.
(Self-chat and mixed-language generation samples in English, Italian, French, and Indonesian appeared here; see Figures 5-37 and Tables 8-9 for the full dialogues.)

Table 2: The statistics of the collected dataset. We report the number of dialogues (#Dial.) and utterances (#Utt.) of the validation and test sets in six languages. The edit distance per dialogue (Edit) and the BLEU score are computed to show the difference between the human-annotated dataset and the auto-translated dataset. (Training-set statistics are reported in Appendix A.)

Table 3: Results of the automatic evaluation on the test set in seven languages. We compute the BLEU score and perplexity (ppl.) for the monolingual, multilingual, and cross-lingual models.

Table 4: Results of the ACUTE-EVAL human evaluation. Tests are conducted pairwise between M-CausalBert (Multi) and the other models (Human, Poly-encoder (Poly), monolingual CausalBert (Mono)). Numbers indicate the winning rate of Multi. Numbers in bold are statistically significant (p < 0.05).

Table 8: One-to-many by M-CausalBert.

Table 9: Many-to-one by M-CausalBert.