Topic-Guided Self-Introduction Generation for Social Media Users

Millions of users are active on social media. To allow users to better showcase themselves and network with others, we explore the automatic generation of social media self-introductions: short sentences outlining a user's personal interests. While most prior work profiles users with tags (e.g., ages), we investigate sentence-level self-introductions to provide a more natural and engaging way for users to know each other. Here we exploit a user's tweeting history to generate their self-introduction. The task is non-trivial because the history content may be lengthy, noisy, and exhibit various personal interests. To address this challenge, we propose a novel unified topic-guided encoder-decoder (UTGED) framework; it models latent topics to reflect salient user interests, whose topic mixture then guides encoding a user's history while topic words control decoding their self-introduction. For experiments, we collect a large-scale Twitter dataset, and extensive results show the superiority of our UTGED over advanced encoder-decoder models without topic modeling.


Introduction
The immense popularity of social media has resulted in an explosive number of users, creating and broadcasting massive amounts of content every day. Although it offers rich resources for users to build connections and share content, the sheer quantity of users might hinder one from finding those they want to follow (Matikainen, 2015). To enable users to quickly know each other, many social platforms encourage a user to write a self-introduction, a sentence overviewing their personal interests.
A self-introduction is part of a self-described profile, which may also include locations, selfies, user tags, and so forth, and is crucial in online user interactions (McCay-Peet and Quan-Haase, 2016). Previous findings (Hutto et al., 2013) indicate that users tend to follow those displaying self-introductions, because a well-written self-introduction briefs others about a user's interests and facilitates initializing connections. It benefits users in making like-minded friends and gaining popularity; however, not all users are skillful in writing a good self-introduction. We are thus interested in how NLP may help, and study self-introduction generation, a new application that learns user interests from their historical tweets (henceforth user history) and briefs them in a self-introduction.

Figure 1: Twitter user U with a self-introduction on top, followed by previously published tweets (user history). U exhibits a mixture of personal interests in Delaware, invertebrates, paleontology, museum, and others.
Self-introduction: Invertebrate Paleontologist and Collection Manager at the Delaware Museum of Natural History.
User history (sampled tweets, interest labels in parentheses): "How Delaware are you? New book on the 'secret' First State may stump you httpurl" (Delaware); "Duck! Octopuses caught on camera throwing things at each other" (invertebrates); "Rare fossil #clam discovered alive httpurl" (paleontology); "'A labor of love' | Revamped Delaware Museum of Nature and Science opens its doors to the public again" (Delaware; museum); "Delaware's close to naming an official state dinosaur!" (Delaware; paleontology); "She's back: Museum of Nature and Science sets reopening events" (museum); "Rafinesque, Ready for a Close-Up httpurl" (others); "Researchers have unlocked the secret to pearls' incredible symmetry" (invertebrates); "New Jersey is a strange beautiful place. httpurl" (others)
Despite substantial efforts made in profiling users, most existing work (Li et al., 2014; Farseev et al., 2015; Farnadi et al., 2018; Chen et al., 2019b) focuses on extracting keywords from user history and producing tag-level user attributes (e.g., interests, ages, and personality), which may later characterize personalization and recommendation (Wang et al., 2019a; Liang et al., 2022). However, tag-level attributes profile a user through a fragmented view, which human readers may find difficult to read. On the contrary, we automate the writing of a sentence-level self-introduction via language generation, providing a more natural and easy-to-understand way to warm up social interactions. It consequently enables a better socializing experience and user engagement on social media.
To practically train NLP models with capabilities in self-introduction writing, we collect a large-scale Twitter dataset with 170K public users. Each user presents a self-introduction (manually written by themselves) and previous tweets in their history, corresponding to a total of 10.2M tweets.
For methodology design, we take advantage of cutting-edge practices using pre-trained encoder-decoder models for language understanding and generation. However, in real-world practice, users may post numerous tweets exhibiting lengthy content, noisy writing, and diverse interests; these may challenge existing encoder-decoder models in capturing salient personal interests and reflecting them in brief self-introduction writing.
To illustrate this challenge, Figure 1 shows the self-introduction of a Twitter user U and some sampled tweets from U's user history. U exhibits a mixture of interests varying across Delaware, invertebrates, paleontology, museum, and others, scattered across multiple noisy tweets. It presents a concrete challenge for models to digest the fragmented information, distill the introduction-worthy points, and condense them into a concise, coherent, and engaging self-introduction for further interactions. Moreover, existing NLP models are ineffective in encoding very long documents (Cao and Wang, 2022), whereas popular users may post numerous tweets, resulting in a lengthy history to encode.
Consequently, we propose a novel unified topic-guided encoder-decoder (UTGED) framework for self-introduction generation. First, a neural topic model (Srivastava and Sutton, 2017) clusters words by statistics to learn a mixture of latent topics characterizing the user interests underlying their lengthy history. Then, we inject the latent topics into a BART-based encoder and decoder (Lewis et al., 2020); the encoder employs topic distributions as continuous prompts (Lester et al., 2021; Liu et al., 2021; Li and Liang, 2021) to guide capturing the personal interest mixture, and the decoder adopts topic words to control the writing of a personalized self-introduction.
In experimental results, comparisons in both automatic and human evaluation show that UTGED outperforms state-of-the-art encoder-decoder models without topic guidance, and ablation studies indicate the individual contributions of the topic-guided encoder and decoder. Then, we conduct parameter analyses on topic number and topic prompt length, followed by a study on model performance for users varying in historical tweet number, where UTGED consistently performs better. Finally, a case study and an error analysis interpret UTGED's superiority and limitations.
To the best of our knowledge, we present the first NLP study on self-introduction writing from user tweeting history, where we build the first dataset for its empirical study and show the benefits of latent topics to the state-of-the-art encoder-decoder paradigm. Below are details of our contributions.
• We present a new application to capture personal interests from a user's tweeting history and generate their self-introductions accordingly.
• We approach the application with a novel UTGED (unified topic-guided encoder-decoder) framework, which explores latent topics to represent users' personal interests and to jointly guide user encoding and self-introduction decoding.
• We construct a large-scale Twitter dataset for self-introduction study, and extensive experimental results on it show UTGED's practical superiority and the benefits of latent topics for the task.

Related Work
Our work relates to user profiling (by task formulation) and topic modeling (by methodology).
User Profiling. This task aims to characterize user attributes to reflect a personal view. Most previous work focuses on modeling a user's tweeting history (Li et al., 2014) and social network interactions (Qian et al., 2019; Wang et al., 2019a; Chen et al., 2019b; Wang et al., 2021; Wei et al., 2022) to predict user attribute tags (e.g., ages and interests). However, most existing work focuses on classifying user profiles into fragmented and limited tags. Different from them, we study sentence-level self-introduction and explore how NLP handles such personalized generation, which initializes the potential to profile a user via self-introduction writing.
Topic Modeling. We model latent topics with a VAE-based neural topic model (Srivastava and Sutton, 2017). Latent topics have proven beneficial to many NLP writing applications, such as language generation for dialogue summaries (Zhang et al., 2021), dialogue responses (Zhao et al., 2017, 2018; Chan et al., 2021; Wang et al., 2022), poetry (Chen et al., 2019a; Yi et al., 2020), social media keyphrases (Wang et al., 2019b), quotations (Wang et al., 2020), and stories (Hu et al., 2022). Most existing methods focus on exploiting topics in decoding and injecting latent topic vectors (topic mixtures) to assist generation. In contrast to the above scenarios, our application requires digesting much lengthier and noisier inputs with scattered key points; thus, we leverage topics more finely and enable their joint guidance in encoding (by feeding in the topic mixture as topic prompts) and decoding (using topic words to control word-by-word generation).
Inspired by the success of pre-trained language models (PLMs), some efforts have been made to incorporate PLMs into VAEs for topic modeling (Li et al., 2020; Gupta et al., 2021; Meng et al., 2022). However, PLMs might be suboptimal in modeling user history (formed by numerous noisy tweets), because they tend to be limited in encoding very long documents (Cao and Wang, 2022). Here, we model latent topics by word statistics, allowing better potential to encode long input.

Twitter Self-Introduction Dataset
To set up empirical studies for social media self-introduction, we build a large-scale Twitter dataset.
Data Collection. We first collected public tweets posted from September 2018 to September 2019. Then, we extracted the user IDs therein and removed the duplicated ones. Next, we gathered users' tweeting histories and self-introductions via the Twitter API and filtered out inactive users with fewer than 30 published tweets. For users with over 100 published tweets, only the latest 100 were kept. Finally, we kept only the English tweet text and removed irrelevant fields, e.g., images and videos.
Data Pre-processing. First, we removed non-English self-introductions and those too short (<7 tokens) or too long (>30 tokens). Second, we employed SimCSE (Gao et al., 2021) (an advanced model for semantic matching) to measure the text similarity between a user's self-introduction and their tweeting history. Then, for training-quality concerns, we removed users whose self-introductions exhibit less than a 0.4 similarity score on average to the top-30 tweets in their history. Third, the remaining 176,199 unique user samples each correspond to a pair of user history (source) and self-introduction (target). For model evaluation, we randomly split the user samples into training (80%), validation (10%), and test (10%) sets.
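The similarity filter above can be sketched as follows. This is a minimal sketch assuming tweet and self-introduction embeddings (e.g., from SimCSE) have already been computed; the function names and the numpy-based cosine computation are illustrative, not the authors' implementation.

```python
import numpy as np

def mean_top_similarity(intro_vec, tweet_vecs, top_k=30):
    """Mean cosine similarity between a self-introduction embedding and
    the top-k most similar tweet embeddings in the user's history."""
    intro = intro_vec / np.linalg.norm(intro_vec)
    tweets = tweet_vecs / np.linalg.norm(tweet_vecs, axis=1, keepdims=True)
    sims = tweets @ intro            # cosine similarity per tweet
    top = np.sort(sims)[-top_k:]     # keep the top-k scores
    return float(top.mean())

def keep_user(intro_vec, tweet_vecs, threshold=0.4, top_k=30):
    """Retain a user only if the introduction is sufficiently grounded
    in their tweeting history (the quality filter described above)."""
    return mean_top_similarity(intro_vec, tweet_vecs, top_k) >= threshold
```

Users whose introductions are weakly grounded in their history (below the 0.4 threshold) are dropped before the train/validation/test split.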

Data Analysis. Encoder-decoder models are widely used in summarization tasks (§2). We then discuss how our task differs through an empirical lens. Table 1 compares the statistics of our dataset with other popular summarization datasets. We observe that each of our data samples exhibits a longer source text and a shorter target text than the other datasets. This indicates the challenge of our self-introduction task, where directly applying summarization models may be ineffective.
To further analyze the challenges, Figure 2(a) displays the distribution of SimCSE-measured source-target similarity (averaged over the top-30 tweets in user history). It implies that very few tweets are semantically similar to their authors' self-introductions, making it insufficient to simply "copy" from history tweets. We then analyze the tweet number distribution in user history, shown in Figure 2(b). Notably, 37% of users posted over 90 history tweets, scattering interest points across numerous tweets and hindering models from capturing the essential ones to write a self-introduction.

Our UTGED Framework
Here we describe our UTGED (unified topic-guided encoder-decoder) framework. Its overview is in Figure 3: latent topics guide the PLMs to encode user history and decode self-introductions.
The data is formulated as N source-target pairs {(X^i, Y^i)}, where X^i = {x^i_1, x^i_2, ..., x^i_m} denotes the user history of m tweets published by user u^i, and Y^i is the user-written self-introduction. In our task, for user u^i, models are fed the history tweets X^i and trained to generate the self-introduction Y^i.

Neural Topic Model
To explore users' interests hidden in their numerous and noisy tweets, we employ a neural topic model (NTM) (Srivastava and Sutton, 2017) to learn latent topics (word clusters). NTM is based on a variational autoencoder (VAE) with an encoder and a decoder to reconstruct the input.
For word-statistics modeling, the history tweets in X^i are first processed into a bag-of-words (BoW) vector X^i_bow ∈ R^{V_bow}, where V_bow denotes NTM's vocabulary size. Then, similar to a VAE, the NTM encoder transforms the BoW vector into a latent topic vector, detailed as follows.
NTM Encoder. Given the BoW vector X^i_bow, the NTM encoder learns the mean µ^i and standard deviation σ^i under the assumption that words in X^i follow a Gaussian prior. They are encoded as follows and later used to compute the latent topic vector z^i:

µ^i = f_µ(f_e(X^i_bow)),  log σ^i = f_σ(f_e(X^i_bow))  (1)

where f_*(·) indicates a single-layer perceptron performing a linear transformation of the input vectors.
NTM Decoder. We then reconstruct the BoW of X^i from the NTM-encoded µ^i and σ^i. We hypothesize that the corpus exhibits K latent topics, each reflecting a certain user interest and represented by a word distribution over the vocabulary V_bow. Besides, user history X^i is represented as a topic mixture θ^i reflecting u^i's interest combination over the K topics. The procedure is as follows:

z^i = µ^i + σ^i · ε,  ε ∼ N(0, I)
θ^i = softmax(f_θ(z^i)),  p(X^i_bow) = softmax(f_ϕ(θ^i))

where f_θ and f_ϕ are single-layer perceptrons. The weight matrix of f_ϕ indicates the topic-word distributions (ϕ_1, ϕ_2, ..., ϕ_K).
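The NTM encoder and decoder above can be sketched as one PyTorch module. This is a minimal sketch after Srivastava and Sutton (2017); the class name, layer sizes, and activation choices are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NeuralTopicModel(nn.Module):
    """VAE-style neural topic model sketch: BoW in, reconstructed word
    distribution and topic mixture theta out."""
    def __init__(self, vocab_size=10_000, hidden=200, n_topics=100):
        super().__init__()
        self.f_e = nn.Linear(vocab_size, hidden)      # shared encoder layer
        self.f_mu = nn.Linear(hidden, n_topics)       # mean head
        self.f_sigma = nn.Linear(hidden, n_topics)    # log-std head
        self.f_theta = nn.Linear(n_topics, n_topics)  # topic-mixture layer
        self.f_phi = nn.Linear(n_topics, vocab_size)  # topic-word matrix

    def forward(self, x_bow):
        h = F.relu(self.f_e(x_bow))
        mu, log_sigma = self.f_mu(h), self.f_sigma(h)
        z = mu + log_sigma.exp() * torch.randn_like(mu)   # reparameterize
        theta = F.softmax(self.f_theta(z), dim=-1)        # topic mixture
        recon = F.log_softmax(self.f_phi(theta), dim=-1)  # word distribution
        return recon, theta, mu, log_sigma

    def top_topic_words(self, theta, l=30):
        """Word ids of the top-l words of the major topic per sample."""
        c = theta.argmax(dim=-1)                 # index of dominant topic
        phi = self.f_phi.weight                  # (vocab, n_topics)
        return phi[:, c].topk(l, dim=0).indices  # (l, batch)
```

The weight matrix of `f_phi` plays the role of the topic-word distributions ϕ, so the top-l topic words can be read directly from its columns.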
The learned latent topics for X^i will later guide the BART-based self-introduction generation (to be discussed in §4.2). The topic mixture θ^i will be injected into the BART encoder for capturing salient interests, and the top-l words A^i = {a^i_1, a^i_2, ..., a^i_l} with the highest topic-word probability in ϕ_c (c indexes the major topic suggested by θ^i) will control the writing process of the BART decoder.

Topic-Guided Generation Model
We have discussed how to model a user u^i's latent interests with NTM; the learned latent topics (θ^i and topic words A^i) then guide a BART-based encoder-decoder model to generate u^i's self-introduction Y^i. In the following, we first present how we select tweets (to fit overly long user history into a transformer encoder), followed by our topic-guided designs for encoding and decoding.
Tweet Selection. Recall from §3 that user history tends to be very long (Table 1 shows 1,581.3 tokens on average), whereas the BART encoder limits its input length. To fit the input, we go through the following steps to shortlist representative tweets from a user u^i's lengthy tweeting history X^i.
First, we measure how well a tweet x^i_u represents X^i by averaging its similarity to all other tweets:

rep(x^i_u) = (1 / (m − 1)) Σ_{v≠u} Sim(x^i_u, x^i_v)  (2)

where Sim(·, ·) is the SimCSE-measured cosine similarity. Then, we maintain a shortlist R^i holding X^i's representative tweets; it is empty at the beginning and iteratively extended with the tweet x^i_h obtaining the highest similarity score (Eq. 2). To mitigate redundancy in R^i, once x^i_h is put into R^i, it is removed from X^i, as are other tweets in X^i whose cosine similarity to x^i_h exceeds a threshold λ (i.e., 0.8). For easy reading, we summarize the above steps in Algorithm 1.
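The greedy shortlisting loop above can be sketched as follows. This is a minimal sketch assuming unit-normalized tweet embeddings (e.g., from SimCSE) as input; the function name and return format are illustrative.

```python
import numpy as np

def select_tweets(vecs, redundancy=0.8):
    """Greedy tweet shortlisting: repeatedly pick the tweet most similar
    on average to the remaining pool, then drop near-duplicates above
    the redundancy threshold. `vecs` are unit-normalized embeddings."""
    sim = vecs @ vecs.T                      # pairwise cosine similarities
    remaining = list(range(len(vecs)))
    shortlist = []
    while remaining:
        # representativeness: mean similarity to the other remaining tweets
        scores = [(sim[i, remaining].sum() - 1.0) / max(len(remaining) - 1, 1)
                  for i in remaining]
        best = remaining[int(np.argmax(scores))]
        shortlist.append(best)
        # remove the pick and anything too similar to it
        remaining = [j for j in remaining
                     if j != best and sim[best, j] <= redundancy]
    return shortlist
```

With the default threshold of 0.8, near-duplicate tweets collapse to a single representative, which keeps the shortlist diverse before truncating to the encoder's input budget.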
After that, we further rank the shortlisted tweets in R^i by their overall similarity in X^i (Eq. 2). The top ones are kept and concatenated chronologically to form a word sequence R^i = {w^i_1, w^i_2, ..., w^i_M} (M denotes the word number).
Topic Prompt Enhanced Encoder (TPEE). We then discuss how we encode R^i (the selected user history tweets) under the guidance of the topic mixture θ^i (featuring latent user interests). The encoding adopts the BART encoder and is trained with θ^i-based prompt fine-tuning (hence named TPEE, short for topic prompt enhanced encoder).
We first obtain the topic prompt as follows:

{b^i_1, b^i_2, ..., b^i_L} = MLP(θ^i)  (3)

where MLP is a feedforward neural network. Following Li and Liang (2021), L indicates the topic prompt length, and each vector b^i_j ∈ R^d. To inject the guidance from the topic prompts {b^i_1, b^i_2, ..., b^i_L} (carrying latent topic features), we place them side by side with the embeddings of words {w^i_1, w^i_2, ..., w^i_M} (reflecting the word semantics of R^i). Then, a BART encoder E represents user u^i's salient interests H^i_E in its last layer:

H^i_E = E([b^i_1; ...; b^i_L; e^i_1; ...; e^i_M])  (4)

where e^i_j ∈ R^d is the BART-encoded word embedding of w^i_j and [;] is the concatenation operation.
Topic Words Enhanced Decoder (TWED). Recall from §4.1 that NTM generates l topic words (A^i) depicting a user u^i's major latent interests. To further reflect such interests in the produced self-introduction, we employ A^i to control a BART decoder D in its word-by-word generation through the topic control module.
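The topic-prompt construction in TPEE can be sketched as follows: an MLP maps the NTM topic mixture θ to L continuous prompt vectors, which are prepended to the word embeddings before the transformer encoder. The class name, MLP depth, and dimensions are illustrative assumptions.

```python
import torch
import torch.nn as nn

class TopicPrompt(nn.Module):
    """Sketch of topic-prompt injection: theta -> L prompt vectors,
    concatenated in front of the word embeddings."""
    def __init__(self, n_topics=100, d_model=768, prompt_len=7):
        super().__init__()
        self.prompt_len, self.d_model = prompt_len, d_model
        self.mlp = nn.Sequential(
            nn.Linear(n_topics, d_model),
            nn.Tanh(),
            nn.Linear(d_model, prompt_len * d_model),
        )

    def forward(self, theta, word_embs):
        # theta: (batch, n_topics); word_embs: (batch, seq, d_model)
        prompts = self.mlp(theta).view(-1, self.prompt_len, self.d_model)
        return torch.cat([prompts, word_embs], dim=1)  # (batch, L+seq, d)
```

The concatenated sequence is then fed to the encoder in place of the plain word embeddings, so the prompt vectors are attended to like ordinary tokens.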
For easy understanding, we first describe how the original BART decodes. At the t-th step, the decoder D is fed its previous hidden states H^i_{D,t}, the BART encoder's hidden states H^i_E (Eq. 4), and the latest generated word Y^i_t, yielding the hidden state o^i_{t+1}. Based on that, the next word is generated following the token distribution p^i_{t+1}. The concrete workflow is as follows:

o^i_{t+1}, H^i_{D,t+1} = D(H^i_{D,t}, H^i_E, Y^i_t)  (5)
p^i_{t+1} = softmax(W_e o^i_{t+1})  (6)

where H^i_{D,t+1} stores all previous decoder hidden states up to step t + 1, and W_e is a learnable matrix mapping the latent logit vector o^i_{t+1} to the target vocabulary. Then, we engage the topic words A^i to control the above procedure through the topic control module. Inspired by the BoW attribute model (Dathathri et al., 2020), we compute the following log-likelihood to weigh the word generation probability p^i_{t+1} over each topic word a^i_j ∈ A^i:

log p(A^i | Y^i_{t+1}) = Σ_j log p^i_{t+1}[a^i_j]  (7)

The gradient from log p(A^i | Y^i_{t+1}) is further involved in updating all decoder layers (H^i_{D,t}) of D:

∆H^i_{D,t} ← ∆H^i_{D,t} + α · ∇_{∆H^i_{D,t}} log p(A^i | Y^i_{t+1}) / ‖∇_{∆H^i_{D,t}} log p(A^i | Y^i_{t+1})‖^γ  (8)
H̃^i_{D,t} = H^i_{D,t} + ∆H^i_{D,t}  (9)

where H̃^i_{D,t} indicates the updated (topic-controlled) decoder states, ∆H^i_{D,t} is the gradient update to H^i_{D,t}, α is the step size, and γ is the normalization value. Furthermore, we adopt the same topic-controlling strategy to update the encoder's final-layer states H^i_E and derive the updated states H̃^i_E based on Eq. 8 and Eq. 9. With Eq. 5 and 6, we accordingly obtain the final token distribution based on the topic-controlled encoder and decoder states H̃^i_E and H̃^i_{D,t}, and the previously predicted word Y^i_t.
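A single PPLM-style control update (after Dathathri et al., 2020) can be sketched with autograd: compute the topic-word bag log-likelihood under the current token distribution, then nudge the hidden state along its normalized gradient. The function name, the single-step form, and the shapes are illustrative assumptions, not the exact implementation.

```python
import torch
import torch.nn.functional as F

def topic_control_step(h, W_e, topic_ids, alpha=0.25, gamma=1.5):
    """One gradient-ascent step pushing a decoder hidden state toward
    generating the NTM topic words. `W_e` maps hidden states to
    vocabulary logits; `topic_ids` are the topic-word vocabulary ids."""
    h = h.detach().requires_grad_(True)
    probs = F.softmax(h @ W_e.t(), dim=-1)           # token distribution
    # log-likelihood of the topic-word bag under the current distribution
    log_p = torch.log(probs[..., topic_ids].sum(dim=-1) + 1e-12).sum()
    (grad,) = torch.autograd.grad(log_p, h)
    # normalized gradient ascent on the hidden state
    return (h + alpha * grad / grad.norm().pow(gamma)).detach()
```

After the update, the token distribution recomputed from the perturbed state places more mass on the topic words, steering the next prediction without retraining the decoder.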

Joint Training in a Unified Framework
To couple the effects of NTM (described in §4.1) and the topic-guided encoder-decoder module for self-introduction generation (henceforth SIG, discussed in §4.2), we place the two modules in a unified framework and jointly train them for better collaboration. The loss function of the unified framework is hence a weighted sum of the NTM and SIG losses:

L = L_SIG + α · L_NTM  (10)

where L_NTM and L_SIG are the loss functions of NTM and SIG, and α is a hyper-parameter trading off their effects, set to 0.01 in our experiments.
For NTM, the learning objective is computed as:

L_NTM = D_KL(q(z | X_bow) ‖ p(z)) − E_q[log p(X_bow | z)]  (11)

where D_KL(·) indicates the Kullback-Leibler divergence loss and the expectation term is the reconstruction loss. The SIG is trained with the cross-entropy loss:

L_SIG = − Σ_t log p(Y^i_t | Y^i_{<t}, X^i)  (12)

In practice, we first train the unified framework with Eq. 10, excluding A^i (the topic-word output of NTM). Then, during inference, we fix UTGED, employ A^i to control the decoding process, and generate the final self-introduction via Eq. 7~Eq. 9.
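The unified objective can be sketched as a single function combining the generation cross-entropy with the α-weighted NTM ELBO terms. Variable names and reductions are illustrative; the diagonal-Gaussian KL form is a standard assumption.

```python
import torch
import torch.nn.functional as F

def joint_loss(recon_log_probs, x_bow, mu, log_sigma,
               gen_logits, target_ids, alpha=0.01):
    """Sketch of the unified loss: L = L_SIG + alpha * L_NTM."""
    # NTM: KL(q(z|x) || N(0, I)) for a diagonal Gaussian posterior
    kl = -0.5 * torch.sum(1 + 2 * log_sigma - mu.pow(2)
                          - (2 * log_sigma).exp())
    # NTM: BoW reconstruction (negative log-likelihood)
    recon = -(x_bow * recon_log_probs).sum()
    l_ntm = kl + recon
    # SIG: token-level cross-entropy for the self-introduction
    l_sig = F.cross_entropy(gen_logits.view(-1, gen_logits.size(-1)),
                            target_ids.view(-1))
    return l_sig + alpha * l_ntm
```

With α = 0.01, the topic-model terms act as a light regularizer on the shared training signal rather than dominating the generation loss.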

Experimental Setup
Model Settings. We implemented NTM (§4.1) based on Srivastava and Sutton (2017) and set its topic number K to 100, its BoW vocabulary size V_bow to 10K, and its hidden size to 200. The input of NTM is the BoW of the original user history X^i, while the input of SIG is capped at 1,024 tokens based on the shortlisted tweets in R^i (§4.2). The SIG model is based on BART and built with 6 encoding layers and 6 decoding layers. We adopted AdamW and SGD to optimize the SIG and NTM, respectively. The learning rate is set to 5 × 10^-5 for SIG and 1 × 10^-4 for NTM, and the topic prompt length L to 7. To warm up joint training (Eq. 10), we pre-train NTM with Eq. 11 for 100 epochs. During joint training, the batch size is set to 8 and the maximum epoch to 5. In topic-controlled decoding, α is set to 0.25 and γ to 1.5 (Eq. 9). The topic word number l is set to 30. Models are trained on a 24GB NVIDIA RTX 3090 GPU.
Evaluation Metrics. For automatic evaluation, we adopt ROUGE-1 (R-1), ROUGE-2 (R-2), and ROUGE-L (R-L), popular language generation metrics based on output-reference word overlap, originally designed for summarization (Lin, 2004). We also conduct a human evaluation on a 5-point Likert scale over three criteria: fluency of the generated language, consistency of a self-introduction with the user's history, and informativeness in reflecting essential user interests.
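As a toy illustration of the ROUGE-style n-gram overlap underlying R-1 and R-2 (real evaluations use the official toolkit, and R-L additionally relies on longest common subsequences):

```python
from collections import Counter

def rouge_n(candidate, reference, n=1):
    """Toy ROUGE-N F1 based on n-gram overlap between whitespace-tokenized
    candidate and reference strings."""
    def ngrams(tokens):
        return Counter(tuple(tokens[i:i + n])
                       for i in range(len(tokens) - n + 1))
    cand, ref = ngrams(candidate.split()), ngrams(reference.split())
    if not cand or not ref:
        return 0.0
    overlap = sum((cand & ref).values())      # clipped n-gram matches
    p, r = overlap / sum(cand.values()), overlap / sum(ref.values())
    return 0.0 if p + r == 0 else 2 * p * r / (p + r)
```

This sketch omits stemming and stopword handling that the official implementation supports; it is only meant to convey what the reported scores measure.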
In addition, we examine upper-bound tweet selection (shortlisting given the reference self-introduction). Here, SimCSE first measures the similarity between the reference and each tweet in user history X^i. Oracle_E then extracts the tweet with the highest similarity score. For Oracle_A, we rank tweets by similarity score and feed the top ones into BART for generation. Furthermore, to explore the potential of our topic-guided design over the Oracle_A model, we feed Oracle_A's input to our UTGED and name it Oracle_A+Topic.

Main Comparison Results
Table 2 shows the main comparison results. We first observe inferior results from all extractive models, including Oracle_E, because of the non-trivial content gap between users' history tweets and their self-introductions (also indicated in Figure 2). Directly extracting tweets from user history is thus infeasible for depicting a self-introduction, presenting the need for language generation. For this reason, abstractive methods exhibit much better performance than extractive baselines.
Among the abstractive models, UTGED yields the best ROUGE scores and significantly outperforms the previous state-of-the-art summarization models. It shows the effectiveness of engaging latent topic guidance over lengthy and noisy user history, which may usefully signal the salient interests for writing a self-introduction.
In addition, by comparing Oracle_A results with model results, we observe a large margin in between. It suggests the challenge and importance of tweet selection for user history encoding, providing insight for future related work. Moreover, interestingly, Oracle_A+Topic further outperforms Oracle_A, implying the topic-guided design would likewise benefit upper-bound tweet selection scenarios.
Ablation Study. Here we probe how UTGED's different modules work and show the ablation study results in Table 3. All modules (tweet selection (S), TPEE (E), and TWED (D)) contribute positively because they are all designed to guide models in focusing on essential content reflecting salient user interests.
Human Evaluation. To further test how useful our output is to human readers, we randomly select 100 samples from the test set and train 3 in-house annotators with NLP backgrounds to rate the generated self-introductions. As shown in Table 4, UTGED is superior in informativeness and consistency, implying latent topics can usefully help capture salient interests from lengthy and noisy user history. However, its fluency is lower than BART's, indicating that topic words slightly perturb the pre-trained decoder (Dathathri et al., 2020).

Quantitative Analysis
To better study UTGED, we quantify the topic number, prompt length, and input tweet number to examine how they affect performance. Only R-L is shown for better display; similar trends were observed for R-1 and R-2. For the full results, we refer readers to Appendix A.3.
Varying Topic Number. The first parameter analysis concerns the topic number K (NTM's hyperparameter). As shown in Figure 4(a), the score first increases then decreases with larger K, peaking at K = 100. We also observe that K = 200 results in much worse performance than other values of K, probably because modeling too fine-grained topics is likely to overfit NTM in user interest modeling, further hindering self-introduction generation.
Varying Prompt Length. Likewise, we analyze the effects of prompt length L in Figure 4(b). The best score is observed at L = 7, much better than very short or very long prompt lengths. Longer prompts may allow stronger hints from NTM, helpful to some extent; however, if the hint becomes too strong (given a too-long prompt), topic features may overwhelm the encoder in learning specific features for self-introduction writing.
Users w/ Varying Tweet Number. Recall from Figure 2(b) that users vary largely in the number of history tweets (attributable to different activity levels). We then examine how models work given varying tweet numbers in history. BART+S and UTGED are tested, both with tweet selection (S) to allow very long input, and Figure 5 shows the results. Both models exhibit growing trends for more active users, benefiting from richer content in their history to infer self-introductions. Comparing the two models, UTGED performs consistently better, showing the gain from NTM is robust across varying users.

Qualitative Analysis
Case Study. Figure 6 shows a user sample interested in "teaching" and "reading", as indicated by topic words like "student", "book", and "school" produced by NTM. From BART's output, we find its error "seesaw specialist" further misleads the model into writing more irrelevant content (e.g., "google certified educator" and "google trainer"). This may be caused by the common exposure-bias problem in language generation (Ranzato et al., 2016; Zhang et al., 2019). On the contrary, UTGED's output stays on-topic till the end, showing topic guidance may mitigate off-topic writing (more topic word cases can be found in Appendix A.4, and longer source tweets are shown in Appendix A.5).
Error Analysis. In the main comparison (Table 2), UTGED performs the best yet still has a non-trivial gap to Oracle_A.
Figure 6: A Twitter user sample and related results, showing from top to bottom the user history (source), major topic words (A^i), BART output, UTGED output, and the reference self-introduction (target). Topic words helpful for our task are highlighted in the paper.
Source: "someone is proud of her artwork now on display in our library!", "we were excited to hear from to learn more about summer reading!", "second graders are becoming familiar with the intricacies of tinytap on our ipads as we prepare for an assured learning experience on folktales", "our makerspace is on the move!"
Topic words: life, love, learning, school, writing, book, read, yoga, kids, students, education, quotes, community, children, time
BART: webersen elementary media specialist, seesaw specialist, google certified educator, google trainer, apple certified educator.
UTGED: elementary library media specialist at webster hill elementary school. i love to connect with my students and help them grow as independent learners.
Target: i proudly teach all pk-5 webster hill students. we learn to think critically, research efficiently, meaningfully integrate technology, and find joy in reading.
Figure 7: Two major error cases, each showing the generated output (G) and the target (T).
Grammar error. G: travel with mei is a travel blog with travel tips, deals, deals and more. T: travel in holiday is a blog that aims to inspire more people that there are more life and adventure to discover in this world.
Topic error. G: we are a group of pet lovers who love dogs and cats and want to share them with you! T: we put your pets on your pants! available for adults and kids makes perfect birthday and holiday gifts leggings and tops

Here we probe UTGED's limitations and discuss the two major error types in Figure 7. First, the output may contain grammatical mistakes, e.g., "deals, deals", limited by BART's decoder capability and the topic words' effects; this calls for involving grammar checking in decoding. The second error type is propagated from wrong latent topics. As shown in the second error case, the user is a provider of pet-style clothes, whereas NTM may cluster them with other "pet lover" users and further mislead the writing process. Future work may explore better topic modeling methods to mitigate the effects of mistaken clustering.
Additionally, we tested a sample of 10,000 users with similarity scores falling in the ranges [0.3, 0.4), [0.2, 0.3), [0.1, 0.2), and [0, 0.1); the results of the best model, Oracle_A+Topic, on these low-similarity data samples are shown in Table 5. The results indicate that low-similarity data samples indeed negatively impact training.

A.3 Full Experimental Results
Varying Topic Number. We show the results from BART+S+E on the left of "/" and those from UTGED on the right.
Varying Prompt Length. We show the results from BART+S+E on the left of "/" and those from UTGED on the right.

A.4 More Topic Word Cases
training, golf, health, fitness, yoga, back, today, day, life, healthy, sports, club, monday, time, dealer, great, body, week, gym, fit, workout, run, team, motivation, free, fun, weight, stay, weekend, start

A.5 Detailed Case Study
Source: "someone is proud of her artwork now on display in our library!", "fifth graders can't wait to read this summer! thanks for reaching out to our kids virtually!", "we were excited to hear from to learn more about summer reading!", "im grateful i spent today in a school with students and teachers talking about story, compassion, and our hearts. thank you!", "it was an incredible day at webster hill with! thank you for sharing your energy, enthusiasm, and love of reading with our students!", "second graders are becoming familiar with the intricacies of tinytap on our ipads as we prepare for an assured learning experience on folktales!", "our makerspace is on the move!", "second graders are taking brief notes using information from pebblego and creating an expert ebook with the book creator app!", "kindergarten friends are browsing for informational texts and previewing the pictures to help them determine the main topic or what it is mostly about. we're practicing some seesaw skills to share our learning, too!", "third graders are becoming independent s of our library! here, they're noticing patters with call numbers to collaboratively organize e books. we want them to be able search for and locate books on any topic or area of interest! well on our way.", "computer science truly connects to all content areas. here, a student is modifying musical notes and tempo to get a keyboard to play a popular song!", "we had an exciting morning at webster hill! it was such a pleasure to welcome and other special guests to a fourth grade library media class on coding.", "more ozobot fun!", "getting to know dot and dash!", "programming ozobot to read color patters!", "pre k has been practicing following specific directions like a robot! we had lots of fun with a red light, green light song!", "after browsing for books, pre k friends engage in some fun centers that encourage cooperation. we're even starting to recognize some letters!", "mrs. bender and i have been spending lots of time making our library extra special for our amazing students! we are so excited to see everyone!", "fifth graders are starting to meet their middle school library media specialist!", "coding with cubetto!", "some research inspired by a true story!", "i was so excited to participate in a virtual author visit with our very own poet lms, jill dailey. amazing.", "this year, i'm getting to spend some time in classrooms working with students in small groups to apply their knowledge of informational texts. so much fun!", "primary students enjoyed reading neither this week with a message of acceptance. we used our love of the character to then spark some creativity and research! we designed new creatures from two animals using seesaw and then began exploring pebblego for facts.", "browsing for good books!", "supporting our budding early emergent readers with a repetitive text, familiar song, and some fun connections with drawing tools in seesaw!", "first graders can identify common text features and how they help readers!", "fifth graders presented their website evaluations, citing evidence from the text and indicators of a reliable source to explain whether or not to use a site for research!", "in kindergarten, we are making connections to our own lives with the characters and settings in stories!", "second graders are identifying information that is safe to share online and showing us what they know with a seesaw activity!", "first graders are using strategies to recount the most important details in literature. here, we illustrated some of what we thought the author couldn't leave out! we even got to practice with our digital learning platform, seesaw.", "library media lessons take place in the classroom this year!", "we're back! our kindergarten friends learned about seesaw this week and began using drawing, photo, and audio recording tools to complete activities. we are digital learners!", "the men and women's soccer teams shared their love of reading with webster hill!", "officer cogle and mr. k shared a story and an important message of supporting one another for our first ever live, virtual, whole school read aloud using google meet!", "state of connecticut superior court judge and webster hill alumnus! susan quinn cobb shared a story, gave background on her job, and took questions from our students."

BART: webersen elementary media specialist, seesaw specialist, google certified educator, google trainer, apple certified educator.
UTGED: elementary library media specialist at webster hill elementary school. i love to connect with my students and help them grow as independent learners.
Target: i proudly teach all pk-5 webster hill students. we learn to think critically, research efficiently, meaningfully integrate technology, and find joy in reading.
Topic words: life, love, learning, school, writing, book, read, yoga, kids, students, education, quotes, community, children, time, reading, learn, math, books, autism, world, chat, quote, story, change, motivation, writers, people, things, english

Figure 9: A Twitter user sample and the related results. From top to bottom, it shows the user history (source T_i), topic words (A_i), the BART output, the UTGED output, and the reference self-introduction (target Y_i). The source text consists of 70 tweets; we randomly sample half of them for better display.

Figure 2: Analysis of the distributions of (a) the average similarity of user history tweets (capped at the top 30) to the self-introduction and (b) the number of tweets in user history.

Figure 3: The overview of our UTGED (Unified Topic-Guided Encoder-Decoder) framework. The left module shows a neural topic model (NTM) representing user interests with latent topics. The topic mixtures help the encoder explore the user history (middle), and the topic words guide the decoder in self-introduction generation (right).
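To make the guidance flow concrete, below is a minimal, purely illustrative sketch of the idea: a toy topic model infers a topic mixture from a user's history, and the dominant topic's words form a prompt that would steer the decoder. The topic table, function names, and scoring rule are hypothetical simplifications for exposition, not UTGED's actual NTM or prompting mechanism.

```python
# Toy sketch of topic-guided generation (illustrative only; UTGED
# learns topic-word distributions with a neural topic model).
from collections import Counter

# Hypothetical topic-word distributions for K=2 latent topics.
TOPIC_WORDS = {
    "education": {"library": 0.3, "students": 0.3, "reading": 0.2, "school": 0.2},
    "tech":      {"coding": 0.4, "robot": 0.3, "app": 0.2, "ipads": 0.1},
}

def topic_mixture(history_tokens):
    """Score each topic by the probability mass its words carry in the history."""
    counts = Counter(history_tokens)
    scores = {t: sum(p * counts[w] for w, p in words.items())
              for t, words in TOPIC_WORDS.items()}
    total = sum(scores.values()) or 1.0
    return {t: s / total for t, s in scores.items()}  # normalized mixture

def guiding_prompt(history_tokens, top_n=3):
    """Pick the dominant topic and return its top words as a decoder prompt."""
    mix = topic_mixture(history_tokens)
    top_topic = max(mix, key=mix.get)
    words = sorted(TOPIC_WORDS[top_topic],
                   key=TOPIC_WORDS[top_topic].get, reverse=True)[:top_n]
    return " ".join(words)

history = "our library students love reading and coding with students".split()
print(guiding_prompt(history))  # -> library students reading
```

In the full model, the mixture conditions the encoder's view of the lengthy history, while the prompt-like topic words constrain decoding toward salient interests.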

Figure 4: Parameter analysis results. The X-axis shows the topic number (a) and the prompt length (b); the Y-axis shows the R-L score measured on our UTGED's output.

user interests against lengthy input. TPEE may show a larger individual gain than the other two, possibly because the topic mixtures directly reflect user interests and are easier for the model to leverage.
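For reference, the R-L score reported throughout is ROUGE-L, which scores a candidate against the reference via their longest common subsequence (LCS). A minimal sketch follows (function names are ours; real evaluations should use a standard ROUGE package with its stemming and bootstrap options):

```python
# Minimal ROUGE-L F1 sketch: LCS length via dynamic programming,
# then the harmonic mean of LCS-based precision and recall.
def lcs_len(a, b):
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i-1][j-1] + 1 if x == y else max(dp[i-1][j], dp[i][j-1])
    return dp[-1][-1]

def rouge_l_f1(candidate, reference):
    c, r = candidate.split(), reference.split()
    lcs = lcs_len(c, r)
    if lcs == 0:
        return 0.0
    prec, rec = lcs / len(c), lcs / len(r)
    return 2 * prec * rec / (prec + rec)

print(round(rouge_l_f1("the cat sat", "the cat sat down"), 4))  # -> 0.8571
```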

Figure 5: The comparison between BART+S and UTGED while varying the sentence number. X-axis: the values of the sentence number; Y-axis: the R-L score.

Figure 6: A Twitter user sample and the related results. From top to bottom, it shows the user history (source T_i), major topic words (A_i), the BART output, the UTGED output, and the reference self-introduction (target Y_i). We inspect the topic words helpful for our task and color them in red.

Figure 7: Examples of major error types in the generation results of UTGED (G) and the target reference (T).

Table 2: Main comparison results. UTGED achieves the best results (highlighted), and the performance gain over all comparison models is significant (indicated by * and measured by a paired t-test with p-value < 0.05).

Table 4: Human evaluation results. Cohen's Kappa for all annotator pairs is 0.63 on average (good agreement).

Table 5: The results of Oracle A+Topic on low-similarity data samples.

Table 6: The effects of the topic number K.