Empirical Studies of Institutional Federated Learning For Natural Language Processing

Xinghua Zhu, Jianzong Wang, Zhenhou Hong, Jing Xiao


Abstract
Federated learning has sparked new interest in the deep learning community in making use of isolated data sources held by independent institutions. With the development of novel training tools, we have successfully deployed federated natural language processing networks on GPU-enabled server clusters. This paper demonstrates federated training of a popular NLP model, TextCNN, applied to sentence intent classification. Furthermore, differential privacy is introduced to protect participants in the training process in a manageable manner. Unlike previous client-level privacy protection schemes, the proposed differentially private federated learning procedure is defined at the dataset-sample level, which is inherent to applications among institutions rather than individual users. Optimal hyper-parameter settings for the federated TextCNN model are studied through comprehensive experiments. We also evaluate the performance of the federated TextCNN model under imbalanced data load configurations. Experiments show that the sampling ratio has a large impact on the performance of the FL models, causing up to a 38.4% decrease in test accuracy, while the models are robust to different noise-multiplier levels, with less than 3% variance in test accuracy. The FL models are also found to be sensitive to the balance of data load among client datasets: when the data load is imbalanced, model performance drops by up to 10%.
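The sample-level differential privacy described above, parameterized by a sampling ratio and a noise multiplier, follows the standard DP-SGD recipe of per-sample gradient clipping plus Gaussian noise. A minimal sketch of that step is below; the function name, defaults, and NumPy formulation are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def dp_average_gradients(per_sample_grads, clip_norm=1.0,
                         noise_multiplier=1.1, rng=None):
    """DP-SGD-style aggregation (illustrative, not the paper's code):
    clip each sample's gradient to L2 norm <= clip_norm, sum,
    add Gaussian noise with std = noise_multiplier * clip_norm,
    then average over the sampled batch."""
    rng = np.random.default_rng(0) if rng is None else rng
    clipped = []
    for g in per_sample_grads:
        norm = np.linalg.norm(g)
        # Scale down only gradients whose norm exceeds the clip bound.
        clipped.append(g * min(1.0, clip_norm / max(norm, 1e-12)))
    total = np.sum(clipped, axis=0)
    # Noise calibrated to the clipping bound; larger noise_multiplier
    # gives stronger privacy at the cost of accuracy.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=total.shape)
    return (total + noise) / len(per_sample_grads)
```

The sampling ratio studied in the paper would correspond to the fraction of each client's dataset drawn to form `per_sample_grads` at every round; together with the noise multiplier it determines the privacy budget spent per step.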
Anthology ID:
2020.findings-emnlp.55
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2020
Month:
November
Year:
2020
Address:
Online
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
625–634
URL:
https://aclanthology.org/2020.findings-emnlp.55
DOI:
10.18653/v1/2020.findings-emnlp.55
Cite (ACL):
Xinghua Zhu, Jianzong Wang, Zhenhou Hong, and Jing Xiao. 2020. Empirical Studies of Institutional Federated Learning For Natural Language Processing. In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 625–634, Online. Association for Computational Linguistics.
Cite (Informal):
Empirical Studies of Institutional Federated Learning For Natural Language Processing (Zhu et al., Findings 2020)
PDF:
https://aclanthology.org/2020.findings-emnlp.55.pdf