Federated Continual Learning for Text Classification via Selective Inter-client Transfer

In this work, we combine two paradigms, Federated Learning (FL) and Continual Learning (CL), for the task of text classification in the cloud-edge continuum. The objective of Federated Continual Learning (FCL) is to improve deep learning models at each client over their lifetime via (relevant and efficient) knowledge transfer without sharing data. Here, we address the challenge of minimizing inter-client interference during knowledge sharing, which arises from heterogeneous tasks across clients in the FCL setup. In doing so, we propose a novel framework, Federated Selective Inter-client Transfer (FedSeIT), which selectively combines model parameters of foreign clients. To further maximize knowledge transfer, we assess domain overlap and select informative tasks from the sequence of historical tasks at each foreign client while preserving privacy. Evaluating against baselines, we show improved performance: an average gain of 12.4% in text classification over a sequence of tasks using five datasets from diverse domains. To the best of our knowledge, this is the first work to apply FCL to NLP.


Introduction
Federated Learning (Yurochkin et al., 2019; Li et al., 2020; Zhang et al., 2020; Karimireddy et al., 2020; Caldas et al., 2018) in Edge Computing (Wang et al., 2019), which extends cloud computing services closer to data sources, has gained traction in recent years due to (a) data privacy and sovereignty, especially as imposed by government regulations (GDPR, CCPA, etc.), and (b) the need for sharing knowledge across edge (client) devices such as mobile phones, automobiles, wearable gadgets, etc., while maintaining data localization. Federated Learning (FL) is a privacy-preserving machine learning (ML) technique that enables collaborative training of ML models by sharing model parameters across distributed clients through a central server, without sharing their data. In doing so, a central server aggregates model parameters from each participating client and then distributes the aggregated parameters, which are used to optimize the ML model at each client, achieving inter-client transfer learning. In this direction, recent works such as FedAvg (McMahan et al., 2017), FedProx (Li et al., 2020), and FedCurv (Shoham et al., 2019) have introduced parameter aggregation techniques and shown improved learning at local clients, augmented by the parameters of foreign clients.
On the other hand, edge devices generate a continuous stream of data whose distribution can drift over time; hence the need for Continual Learning, as humans do. Continual Learning (CL) (Thrun, 1995; Kumar and Daume III, 2012; Kirkpatrick et al., 2017; Schwarz et al., 2018; Gupta et al., 2020) empowers deep learning models to continually accumulate knowledge from a sequence of tasks, reusing historical knowledge while minimizing catastrophic forgetting (drift in what was learned for historical tasks) over their lifetime.
Federated Continual Learning (FCL): This work investigates the combination of the two ML paradigms, Federated Learning and Continual Learning, with the objective of modeling a sequence of tasks over time at each client via inter-client transfer learning, while preserving privacy and addressing heterogeneity of tasks across clients. There are two key challenges in FCL: (1) catastrophic forgetting, and (2) inter-client interference due to heterogeneity of tasks (domains) at clients. At the central server, FedAvg (McMahan et al., 2017) aggregates (averages) model parameters from each client without considering inter-client interference. To address this, the FedWeIT approach (Yoon et al., 2021) performs FCL by sharing task-generic (via dense base parameters) and task-specific (via task-adaptive parameters) knowledge across clients. At the server, they aggregate the dense base parameters but do not aggregate the task-adaptive parameters, and then broadcast both types of parameters; see Figure 2 and Section 2.2 for details. FedWeIT, the first approach in FCL, investigates computer vision tasks (e.g., image classification); however, the technique is limited in aligning the domains of foreign clients during augmented learning at each local client using task-adaptive parameters, which are often misaligned with the local model parameters in parameter space (McMahan et al., 2017) due to heterogeneity in tasks. Therefore, a simple weighted additive composition technique neither addresses inter-client interference nor determines domain relevance of foreign clients while performing transfer learning.

Contributions:
To the best of our knowledge, this is the first work that applies FCL to an NLP task (text classification). At each local client, to maximize inter-client transfer learning and minimize inter-client interference, we propose a novel approach, Federated Selective Inter-client Transfer (FedSeIT), that aligns the domains of foreign task-adaptive parameters via projection during augmented transfer learning. To exploit the effectiveness of domain relevance in handling many foreign clients, we further extend FedSeIT with a novel task selection strategy, Selective Inter-client Transfer (SIT), that efficiently selects relevant task-adaptive parameters from the historical tasks of (many) foreign clients, assessing domain overlap at the global server using encoded data representations while preserving privacy. We evaluate our proposed approaches, FedSeIT and SIT, on the text classification task in the FCL setup using five NLP datasets from diverse domains and show that they outperform existing methods. Our main contributions are as follows: (1) We introduce the Federated Continual Learning paradigm to the NLP task of text classification, collaboratively learning deep learning models at distributed clients through a global server while maintaining data localization, and continually learning over a sequence of tasks over a lifetime: minimizing catastrophic forgetting, minimizing inter-client interference, and maximizing inter-client knowledge transfer.
(2) We present novel techniques, FedSeIT and SIT, that align domains and select relevant task-adaptive parameters of the foreign clients during augmented transfer learning at each client via a global server. Evaluating against the baselines, we demonstrate improved performance, a ... Challenges: However, there are two main sources of inter-client interference within the FCL setup: (1) the use of a single global parameter θ_G during parameter aggregation at the server to capture cross-client knowledge (Yoon et al., 2021), since the aggregate includes model parameters trained on irrelevant foreign client tasks, and (2) non-alignment of the foreign client model parameters, given the heterogeneous task domains across clients. This hinders local model training at each client by updating its parameters in erroneous directions, resulting in: (a) catastrophic forgetting of the client's historical tasks, and (b) sub-optimal learning of the client's current task. For brevity, we omit the round notation r from further equations and mathematical formulations, except in the algorithms.

Federated Selective Inter-client Transfer
where each client initializes B_c^t using the θ_G received from the server (containing global knowledge) before training on task t, to enable inter-client knowledge transfer. The first term therefore signifies selective utilization of global knowledge using the mask parameter m_c^t, which restricts the impact of inter-client interference during server aggregation. Due to the additive decomposition of parameters, the second term A_c^t captures task-specific knowledge. Another key benefit of the parameter decomposition is that, by accessing the task-adaptive parameters A_c^t of past tasks from foreign clients, a client can selectively utilize task-specific knowledge of the relevant tasks, further minimizing inter-client interference and maximizing knowledge transfer. Consider training for task t at client c using the dataset T_c^t ≡ {x_i^t, y_i^t}_{i=1}^{N^t}. The FedSeIT framework segregates the client's local model parameters θ_c^t from the foreign clients' task-adaptive parameters {A_c^{t-1}}_{c=1}^{C}. As illustrated in Figure 1(a), for an input document x ∈ T_c^t, an embedding matrix X ∈ R^{|x|×D} is generated via embedding lookup; then, using X, the CNN model computes: (1) one client dense vector z_c^t ∈ R^{N_F} using the local client parameters θ_c^t, and (2) C foreign dense vectors ẑ_i^t ∈ R^{N_F} using the foreign clients' task-adaptive parameters {A_c^{t-1}}_{c=1}^{C}, where i ∈ {1, ..., C}, |x| is the number of tokens in x, α is the attention parameter, and CNN denotes the convolution and max-pooling function. We then align the foreign parameters in parameter space by concatenating and projecting all foreign dense vectors {ẑ_i^t} to obtain a single foreign vector z_f^t ∈ R^{N_F}. Finally, we selectively augment the relevant knowledge from foreign clients by concatenating and projecting z_c^t and z_f^t for prediction of the label ŷ, where ŷ is the model prediction and L_c^t is the unique label set in T_c^t. By segregating foreign parameters via concatenation and projection of the CNN outputs, FedSeIT enables the local client model to align foreign parameters in feature space and augment knowledge from the relevant foreign tasks. In doing so, FedSeIT effectively circumvents inter-client interference due to heterogeneous task domains while maximizing inter-client transfer.
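The concatenate-and-project composition described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: all weight matrices (W_f for the foreign projection, W_out for the output layer) and the function name are hypothetical stand-ins, and the CNN encoders that produce the dense vectors are assumed to have already run.

```python
import numpy as np

def fedseit_forward(z_local, z_foreign_list, W_f, W_out):
    """Selective augmentation in FedSeIT (sketch): the C foreign
    dense vectors are concatenated and projected to a single foreign
    vector z_f, which is then concatenated with the local vector and
    projected to label logits. W_f and W_out are hypothetical names."""
    # Align foreign knowledge: (C*N_F,) @ (C*N_F, N_F) -> (N_F,)
    z_f = np.concatenate(z_foreign_list) @ W_f
    # Combine local and foreign vectors: (2*N_F,) @ (2*N_F, |L|) -> (|L|,)
    logits = np.concatenate([z_local, z_f]) @ W_out
    # Softmax over the task's unique label set
    e = np.exp(logits - logits.max())
    return e / e.sum()
```

Because the foreign vectors pass through their own learned projection rather than being added into the local filters, the model can down-weight an irrelevant foreign task without perturbing the local parameters.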
Comparison with FedWeIT: As illustrated in Figure 1(b), in contrast to our approach, in the baseline FedWeIT framework each client performs inter-client knowledge transfer via weighted composition of foreign task-adaptive parameters (i.e., convolution filters) along with its local parameters. However, due to the heterogeneity of task domains across clients, this simple additive aggregation of foreign client parameters with local parameters leads to inter-client interference and sub-optimal learning of the local task. Unlike FedWeIT (where the attention parameters α decide the relevance of foreign client parameters), the segregation of foreign client parameters in our proposed FedSeIT method enables the local model to selectively augment relevant knowledge from foreign clients' task-adaptive parameters via projection.
Training: For task t at each client c, the FCL optimization objective for the local client model decomposes into three components, where ŷ_i^t and y_i^t are the predicted and true labels, respectively, for input x_i^t, and N^t is the number of documents. The first term is the model training objective for the current task t. The second term is a sparsity objective that induces sparsity in the mask m_c^t and the task-specific parameters A_c^t for efficient server-client communication, where λ_1 is a hyperparameter regulating sparsity. The final term is the continual learning regularization objective (Kirkpatrick et al., 2017), which minimizes catastrophic forgetting by controlling the drift in parameters learned from past tasks. Here, ΔB_c^t is the change in B_c^t between the current and previous task, i.e., ΔB_c^t = B_c^t − B_c^{t−1}; ΔA_c^i is the change in the task-adaptive parameters of task i between the current and previous time-step; and λ_2 is the hyperparameter regulating this regularization. The task-specific parameters A_c^{1:t−1} of past tasks are updated to balance the change in B_c^t, i.e., ΔB_c^t, which minimizes drift in the solutions learned for past tasks. Higher values of λ_2 place a higher penalty on forgetting previous tasks.

Algorithm 1: Proposed FedSeIT Framework
1: Server initializes θ_G
2: Each client c ∈ C ≡ {1, ..., C} connects with the server
3: for task t = 1, ..., T do
4:   for round r = 1, ..., R do
5:     Server transmits the global parameter θ_G to all c ∈ C
6:     Each client c ∈ C initializes B_c^t using θ_G
7:     if r = 1 then
8:       Each client c ∈ C initializes A_c^t, m_c^t
9:     if r = 1 and t ≠ 1 then
10:      Server transmits {A ...} // prepare model parameters
12:    Each client c ∈ C minimizes Equation 5 using Equations 1-3 to learn task t in the continual learning setup
13:    Each client c ∈ C transmits B_c^{(t,r)} to the server
14:    ... to the server
24:  Server distributes {A ...}

Algorithm 1 describes the FedSeIT framework, where the server aggregation function Agg is agnostic to the choice of available methods, such as FedAvg and FedProx; in FedSeIT, we use FedAvg.
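The three-term objective just described can be sketched as follows. This is an illustration under stated assumptions, not the paper's exact loss: we assume an L1 penalty for the sparsity term and a squared-norm penalty for the drift term, following the FedWeIT-style formulation of Yoon et al. (2021); the function and argument names are ours.

```python
import numpy as np

def fcl_loss(task_loss, m_t, A_t, dB, dA_past, lam1, lam2):
    """Sketch of the FCL training objective:
    task loss + lam1 * sparsity + lam2 * forgetting penalty.
    m_t, A_t: current mask and task-adaptive parameters;
    dB: change in base parameters B between tasks;
    dA_past: list of changes in past tasks' adaptive parameters."""
    # L1 sparsity on the mask and task-adaptive parameters
    sparsity = lam1 * (np.abs(m_t).sum() + np.abs(A_t).sum())
    # Penalize drift in base and past task-adaptive parameters
    drift = lam2 * (np.square(dB).sum()
                    + sum(np.square(d).sum() for d in dA_past))
    return task_loss + sparsity + drift
```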
Dense Layer Sharing (DLS): Unlike the convolution filters of the CNN model, which capture transferable n-gram patterns in the data, the dense layer parameters {W_c^t, W_f^t} capture fine-grained alignment information based on the selection and ordering of the unique output labels y and the foreign client parameters {A_c^{t-1}}_{c=1}^{C}. Under heterogeneity, server aggregation and distribution of dense layer parameters introduces a sub-optimal initialization point for training future tasks across clients. Therefore, in FedSeIT, we do not share projection layer parameters with the server by default. In doing so, we also increase client privacy against adversarial attacks. To validate this hypothesis, we evaluate a variant of our proposed model, FedSeIT+DLS, in which we share the dense projection layer parameters {W_c^t, W_f^t} within the FedSeIT framework.

Selective Inter-client Transfer (SIT)
In FedSeIT, before training on task t, each client receives the C foreign task-adaptive parameters {A_1^{t-1}, ..., A_C^{t-1}} of the previous task t − 1 from each client via the server for inter-client transfer learning. However, given task heterogeneity, the previous task's parameters might be irrelevant for learning the current task, which could lead to inter-client interference. To resolve this, we could transmit the parameters of all historical tasks from all foreign clients, i.e., {A_1^1, ..., A_C^{t-1}}, so that relevant parameters can be found and interference minimized. But this would cause the computational complexity and communication cost to grow with each completed task. Therefore, we propose the Selective Inter-client Transfer (SIT) method, which uses encoded task representations to efficiently explore all historical tasks across foreign clients via domain overlap, and selects the relevant parameters to minimize inter-client interference and maximize knowledge transfer.
Client: For each task t, before training, each client c generates an encoded vector representation x̄(x) ∈ R^D for each document x in the task dataset T_c^t via embedding lookup and averaging, where E is a pre-trained word embedding repository such as Word2Vec (Mikolov et al., 2013). Each client then clusters these encoded representations using the K-Nearest Neighbor (KNN) or Gaussian Mixture Model (GMM) algorithm, i.e., Clustering ∈ {KNN, GMM}, yielding T̃_c^t ∈ R^{Q×D}, the representations of the cluster centers, where Q denotes the number of cluster centers. Finally, each client transmits T̃_c^t to the global server.

Server: Once the server receives the cluster-center representations T̃_c^t of task t from client c, it computes pairwise task-task domain overlap using the average cosine-similarity score between the current client task T̃_c^t and each historical task across all clients {T̃_1^1, ..., T̃_C^{t-1}}. The server then selects and transmits the top-K relevant (highest-similarity) parameters to the client for inter-client transfer learning, where K is a hyperparameter. Therefore, in scenarios where the task history is long (t > 10) and/or there are many clients (C > 10), SIT keeps the computational complexity of the client model constant while minimizing inter-client interference and maximizing knowledge transfer. To test this, we apply SIT in the FedSeIT framework with K ∈ {3, 5}.
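The client-side encoding and server-side top-K selection above can be sketched as follows. This is an illustrative reimplementation, not the authors' code: a simple k-means loop stands in for the paper's KNN/GMM clustering choice, document embeddings are assumed to be precomputed, and all function names are ours.

```python
import numpy as np

def encode_task(doc_embeddings, Q, seed=0):
    """Client side (sketch): cluster the averaged document
    embeddings of a task into Q centers (plain k-means here as a
    stand-in for KNN/GMM); only the centers leave the client."""
    rng = np.random.default_rng(seed)
    centers = doc_embeddings[rng.choice(len(doc_embeddings), Q, replace=False)]
    for _ in range(10):  # a few Lloyd iterations
        assign = np.argmin(((doc_embeddings[:, None] - centers) ** 2).sum(-1), axis=1)
        for q in range(Q):
            if (assign == q).any():
                centers[q] = doc_embeddings[assign == q].mean(0)
    return centers

def select_top_k(current_centers, historical_centers, k):
    """Server side (sketch): score each historical task by the
    average pairwise cosine similarity between its cluster centers
    and the current task's centers; return the top-K task indices."""
    def avg_cos(a, b):
        a = a / np.linalg.norm(a, axis=1, keepdims=True)
        b = b / np.linalg.norm(b, axis=1, keepdims=True)
        return (a @ b.T).mean()
    scores = [avg_cos(current_centers, h) for h in historical_centers]
    return np.argsort(scores)[::-1][:k]
```

Only cluster centers (not raw documents) are exchanged, which is how the domain-overlap assessment preserves privacy.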
Comparison with FedWeIT: Such parameter selection by assessing domain relevance is missing in the baseline FedWeIT framework, where, to control computational complexity and communication cost, each client transmits the task-adaptive parameters of only the previous task to the server. However, as discussed above, these parameters could be irrelevant for learning the current task, resulting in inter-client interference.

Experiments and Analysis
To demonstrate the effectiveness of our proposed FedSeIT framework, we evaluate on the Text Classification task using five datasets from diverse domains and present qualitative and quantitative analyses in the FCL setup.
Here, the R8 and TMN datasets belong to the News domain, TREC6 and TREC50 belong to the Question Classification domain, and SUBJ is a movie-review dataset with binary labels. Please see Appendix A for more details on the datasets.
Experimental setup: We follow Yoon et al. (2021) for our FCL experimental setup, using a CNN-based text classification model (Kim, 2014) as the local client model. For all experiments, we use three clients (C = 3), five tasks per client (T = 5), and 10 rounds per task (R = 10) with 50 epochs in each round, with λ_2 ∈ {0.1, 1.0} and K ∈ {3, 5} for Selective Inter-client Transfer (SIT). Please see Appendix C for detailed hyperparameter settings for all of our experiments. To run the experiments in the FCL setup, we need to generate a task dataset T_c^t for each task t. Consider a dataset D_d with a unique label set, where N_d is the number of unique labels. For each task dataset T_c^t, we randomly pick a fixed number of unique labels L_c^t ⊆ L_d, where the number of unique task labels |L_c^t| = 4 is fixed for all tasks across all clients, except for SUBJ, which has 2 unique labels shared by all tasks. If a label L_1 is selected for three tasks, we follow a non-iid splitting strategy that divides the documents labeled with L_1, i.e., D_d(L_1), into three mutually exclusive and equal parts, thus ensuring heterogeneity. We use this strategy to split the training and validation sets. However, to create the test set for each task dataset T_c^t, we select all documents labeled with L_c^t in the complete test dataset, i.e., {D_d^test(L_i) | L_i ∈ L_c^t}, without splitting. As this work focuses on the new challenges that arise from combining the FL and CL paradigms in the FCL setup, we compare our work with related FCL methods and not with standalone FL or CL methods.
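The non-iid task-construction procedure above can be sketched as follows. This is a hypothetical reimplementation of the described strategy (names and structure are ours): each task draws a fixed number of labels, and when a label is reused by k tasks, its documents are split into k mutually exclusive, roughly equal parts.

```python
import random
from collections import defaultdict

def build_task_datasets(dataset, labels_per_task, n_tasks, seed=0):
    """Sketch of the non-iid split: dataset is a list of
    (document, label) pairs; returns n_tasks lists of pairs with
    mutually exclusive documents for any reused label."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for doc, label in dataset:
        by_label[label].append(doc)
    # Randomly pick a fixed-size label set for each task
    task_labels = [rng.sample(sorted(by_label), labels_per_task)
                   for _ in range(n_tasks)]
    # Count how many tasks reuse each label
    uses = defaultdict(int)
    for ls in task_labels:
        for l in ls:
            uses[l] += 1
    # Hand each task its exclusive share of every label's documents
    cursor = defaultdict(int)
    tasks = []
    for ls in task_labels:
        docs = []
        for l in ls:
            pool = by_label[l]
            share = len(pool) // uses[l]
            start = cursor[l] * share
            docs += [(d, l) for d in pool[start:start + share]]
            cursor[l] += 1
        tasks.append(docs)
    return tasks
```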
Baseline: As the only existing work in the FCL domain, FedWeIT has shown significantly superior performance compared to naïve FCL methods, which is why we adopt it as the baseline.
Evaluation Metric: In each experiment, once training is finished for all C clients, we freeze the model parameters for all tasks and compute the Micro-averaged Accuracy (MAA) score for each of the T past tasks of all C clients. Finally, we average the C × T MAA scores to compute the final Task-averaged Test Accuracy (TTA) score, which we use to compare our proposed model with the baseline. For each experiment, we report the average TTA score over 3 runs using 3 different seed values for the ordering of tasks at each client.
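The evaluation protocol above reduces to two small computations, sketched here (function names are ours):

```python
def micro_accuracy(y_true, y_pred):
    """Micro-averaged accuracy: correct predictions / total predictions."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def task_averaged_test_accuracy(per_task_preds):
    """per_task_preds: one (y_true, y_pred) pair per frozen task,
    C*T pairs in total; the TTA score is the mean of their MAAs."""
    return (sum(micro_accuracy(t, p) for t, p in per_task_preds)
            / len(per_task_preds))
```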

Results: Comparison to baseline
Table 2 shows the final TTA scores after completion of all tasks across all clients on the five text classification datasets. We make the following observations: (1) For all datasets, our proposed FedSeIT consistently outperforms the baseline FedWeIT. For instance, on R8, the classification accuracy is 79.1% for FedWeIT vs. 90.5% for FedSeIT (without SIT). Overall (column 3 vs. column 2), averaged over the five datasets, FedSeIT outperforms FedWeIT by 6.4% and 12.4% for λ_2 = 1.0 (higher penalty on catastrophic forgetting) and λ_2 = 0.1 (lower penalty), respectively. We also observe that FedSeIT applies in sparse settings, for example, small (R8) vs. large (TMN) corpora; see Appendix A for the data statistics. The results suggest that the selective utilization and domain alignment of task-adaptive parameters at local clients prevent inter-client interference and maximize transfer learning.
(2) To demonstrate the effectiveness of the Selective Inter-client Transfer (SIT) approach in limiting the number of foreign parameters (K = 3 or 5) at the local client during augmented learning, Table 2 compares FedSeIT models trained without and with SIT against the FedWeIT baseline. For all datasets, the test scores of FedSeIT trained with SIT (K = 3) are similar to those of FedSeIT trained without SIT (C = 3) at the same computational expense, while reducing the number of parameters as the sequence of historical tasks grows at the clients.
Additionally, we observe improved classification performance when increasing to K = 5, i.e., extending the window size of the number of foreign parameters (considered in augmented learning) from the sequence of historical tasks in continual learning. For instance, on average, the performance gain of FedSeIT over FedWeIT (columns 5 vs. 2) is 6.8% and 12.5% for λ_2 = 1.0 and λ_2 = 0.1, respectively. This suggests that more relevant and domain-aligned foreign parameters boost augmented learning at each local client, i.e., by selecting relevant foreign parameters from all historical tasks of foreign clients using SIT.
(3) Next, we investigate the application of dense layer sharing (DLS) in FedSeIT, comparing it with FedSeIT (without DLS) and FedWeIT. For all datasets, FedSeIT outperforms FedSeIT+DLS, validating our hypothesis that using exclusive dense layer parameters at the local client boosts domain alignment and identification of relevant foreign parameters. Overall, averaged over the five datasets, FedSeIT outperforms FedSeIT+DLS (columns 3 vs. 6) by 4.4% and 19.1% for λ_2 = 1.0 and λ_2 = 0.1, respectively. In summary, evaluating against the baseline FedWeIT approach, the proposed FedSeIT shows improved performance, an average gain of 12.4% in text classification over a sequence of tasks using the five datasets from diverse domains.

Ablation: Learning over training rounds
Here, we examine text classification performance at a local client (the 3rd client) for a sequence of five tasks at each training round (10 rounds). Figure 5 shows the test set accuracy scores (at the end of each training round) for all five tasks of Client-3 using the R8 dataset. In 4 out of 5 tasks, the test accuracy of the FedSeIT model at the end of round 1 is noticeably higher than that of the FedWeIT approach. Interestingly, for task 2, FedWeIT outperforms FedSeIT over the rounds; however, both converge to the same accuracy in the final round.
In essence, the proposed FedSeIT method in the FCL setup shows that the alignment and relevance of foreign task parameters at each client (for all tasks at each model training round) minimize inter-client interference and improve inter-client transfer learning without dense layer sharing.

Conclusion and Future Work
We have applied Federated Continual Learning to text classification for heterogeneous tasks and addressed the challenges of inter-client interference and domain alignment between the model parameters of local vs. foreign clients, while minimizing catastrophic forgetting over a sequence of tasks. We have presented two novel techniques, FedSeIT and SIT, that improve augmented learning at the local client by assessing domain overlap and selecting informative tasks from the sequence of historical tasks at each foreign client while preserving privacy. In particular, the SIT selection strategy determines relevant foreign tasks from the complete task histories of all foreign clients by assessing domain overlap. We have evaluated the proposed approaches using five text classification datasets and shown an average gain of 12.4% over the baseline.
Although we have applied the FedSeIT framework to the document-level text classification task, it can be extended to additional NLP tasks. Inspired by continual topic modeling (Gupta et al., 2020) at the document level and continual named entity recognition (Monaikul et al., 2021) at the token level, we can further extend these existing works with the proposed FedSeIT framework in federated settings.

Limitations
In the FCL paradigm, the computational complexity of augmented learning at a client increases as the number of foreign clients grows. An interesting direction for future work is to explore hierarchical federated learning techniques (Abad et al., 2020) to limit the number of foreign client parameters injected into augmented learning (applying convolution filters and projections in the CNN of FedSeIT) at a local client. Additionally, due to limited compute on edge devices such as mobiles, wearables, sensors, etc., the application of FCL in the cloud-edge continuum is still in its early days and requires distillation and pruning of large ML models, such as the CNN for text classification presented in this paper.

A Datasets
Table 3 shows detailed statistics of the five labeled datasets used to evaluate our proposed FedSeIT framework on the text classification task: Reuters8 (R8) and Tag My News (TMN) belong to the News domain, TREC6 and TREC50 belong to the Question Classification domain, and Subjectivity (SUBJ) is a movie-review dataset with binary labels.

B Local client model
In FedSeIT, we use the CNN model (Kim, 2014) for text classification as the local client model. The CNN model consists of convolution (Conv) layers and fully connected (FC) layers, and its parameters can be described as follows: (1) For Conv layers: B_c^t, A_c^t, θ_c^t ∈ {R^{F_l×D×N_{F_l}}}_{l=1}^{L_Conv} are a set of convolutional filters, where l is the layer index, L_Conv is the total number of Conv layers, F_l is the filter size of layer l, D is the input word embedding dimension, and N_{F_l} is the number of filters in layer l. (2) For FC layers: B_c^t, A_c^t, θ_c^t ∈ {R^{I_l×O_l}}_{l=1}^{L_FC} are a set of parameter matrices, where l is the layer index, L_FC is the total number of FC layers, and I_l, O_l are the input and output dimensions of layer l. For all layers, m_c^t is a masking vector matching the output dimension of B_c^t. As illustrated in Figure 3, for an input document x, the CNN model performs three steps: (1) word embedding lookup of x to generate an input matrix X ∈ R^{|x|×D}, (2) applying convolutional filters and max-pooling over X to generate an intermediate dense vector z ∈ R^{N_F}, and (3) applying a softmax layer on z to predict the label of x.
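The three-step forward pass above can be sketched as follows. This is a minimal single-layer illustration of a Kim (2014)-style text CNN, not the authors' implementation; the output projection W_out and all names are hypothetical, and bias terms and non-linearities are omitted for brevity.

```python
import numpy as np

def cnn_text_forward(X, filters, W_out):
    """Sketch of the CNN client model: 1-D convolution of each
    filter F (shape F_l x D) over the embedding matrix X
    (shape |x| x D), max-pooling each feature map to one value,
    then a softmax output layer."""
    feats = []
    for F in filters:
        h = F.shape[0]
        # Slide the filter over token positions (valid convolution)
        conv = [np.sum(X[i:i + h] * F) for i in range(len(X) - h + 1)]
        feats.append(max(conv))  # max-pooling over positions
    z = np.array(feats)          # dense vector in R^{N_F}
    logits = z @ W_out
    e = np.exp(logits - logits.max())
    return e / e.sum()           # label distribution
```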

C Experimental Setup
Table 4 shows the settings of all hyperparameters used in the experimental setup to evaluate our proposed FedSeIT framework on Text Classification task using 5 datasets.

D Reproducibility: Code
To reproduce the scores reported in the paper, we provide the implementation of our proposed FedSeIT framework at https://github.com/RaiPranav/FCL-FedSeIT. Information regarding model training and data preprocessing is provided in the README file.

Figure 1 :
Figure 1: (a) Illustration of the proposed FedSeIT framework, where task-adaptive parameters of foreign clients are segregated and domain-aligned for selective utilization. How to read: note the coloring scheme in the convolution filters of local and foreign clients and their application in convolution. (b) Weighted additive filter composition performed in the baseline model, FedWeIT. Note the composite θ_c^t vs. the segregated convolution filters of FedSeIT.
To tackle the above-mentioned challenges, we propose the Federated Selective Inter-client Transfer (FedSeIT) framework, which aims to minimize inter-client interference and communication cost while maximizing inter-client knowledge transfer in the FCL paradigm. Motivated by Yoon et al. (2021), the FedSeIT model decomposes each client's model parameters θ_c^t into a set of three different parameters: (1) dense local base parameters B_c^t, which capture and accumulate task-generic knowledge across the client's private task sequence T_c; (2) sparse task-adaptive parameters A_c^t, which capture the task-specific knowledge of each task in T_c; and (3) sparse mask parameters m_c^t, which allow the client model to selectively utilize global knowledge. For each client c, B_c^t is randomly initialized only once, before training on the first task, and is shared throughout the task sequence T_c, while new A_c^t and m_c^t parameters are initialized for each task t. At the global server, the global parameter θ_G accumulates task-generic knowledge across all clients, i.e., global knowledge, by aggregating the local base parameters sent by all clients. Finally, for each client c and task t, the model parameters θ_c^t are composed from these three components.
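The additive decomposition just described can be sketched in one line. We assume the FedWeIT-style form θ_c^t = B_c^t ⊙ m_c^t + A_c^t (elementwise masking of base parameters plus the task-adaptive term), following Yoon et al. (2021); the function name is ours.

```python
import numpy as np

def compose_parameters(B, m, A):
    """Sketch of the parameter composition: the mask m selects
    which entries of the base parameters B (global knowledge) are
    used, and A adds the task-specific knowledge on top."""
    return B * m + A  # elementwise mask, then additive decomposition
```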

Figure 2 :
Figure 2: Inter-client transfer learning in FCL: (top) client-to-server broadcast of the model parameters of client c after training task t − 1; (middle) parameter aggregation at the server; and (bottom) reception of the parameters at client c before training task t.

Figure 3 :
Figure 3: A detailed illustration of CNN client model used in FedSeIT framework.

Figure 4 :
Figure 4: Detailed illustration of the proposed SIT method, where foreign task-adaptive parameters are selected based on task domain-relevance to maximize inter-client transfer learning and minimize communication cost.

Figure 5 :
Figure 5: Test set accuracy scores for all 5 tasks of Client-3 in the FedSeIT framework using the Reuters8 dataset (λ_2 = 1.0). Each data point denotes the test accuracy at the end of a training round, for 10 rounds of 50 epochs each.

Table 1 :
Description of the notations used in this work, where matrices and vectors are denoted by uppercase and lowercase bold characters, respectively.

Table 2 :
Comparison of our proposed FedSeIT framework (with and without SIT) against the FedWeIT baseline model using Task-averaged Test Accuracy (TTA) scores for two values of λ_2 ∈ {1.0, 0.1}. The best score for each dataset (row) is shown in bold, and Gain (%) denotes Bold vs. FedWeIT.