Scaling Language Model Size in Cross-Device Federated Learning

Jae Ro, Theresa Breiner, Lara McConnaughey, Mingqing Chen, Ananda Suresh, Shankar Kumar, Rajiv Mathews


Abstract
Most studies in cross-device federated learning focus on small models, due to the server-client communication and on-device computation bottlenecks. In this work, we leverage various techniques for mitigating these bottlenecks to train larger language models in cross-device federated learning. With systematic applications of partial model training, quantization, efficient transfer learning, and communication-efficient optimizers, we are able to train a 21M parameter Transformer that achieves the same perplexity as that of a similarly sized LSTM with ∼10× smaller client-to-server communication cost and 11% lower perplexity than smaller LSTMs commonly studied in literature.
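The abstract only summarizes the recipe; as a rough illustration of how two of the named ingredients, partial model training and quantized client uploads, shrink client-to-server traffic in a generic FedAvg-style round, here is a minimal NumPy sketch. It is not the authors' implementation: the toy loss, the 25% trainable mask, the 8-bit uniform quantizer, and the client count are all illustrative assumptions.

# Minimal sketch (assumptions noted above), not the paper's code:
# each client trains only a masked subset of parameters (partial model
# training) and uniformly quantizes its update before upload, so the
# client-to-server payload is a fraction of the full float32 model.
import numpy as np

def quantize(update, num_bits=8):
    # Uniform quantization of a float update to num_bits integers.
    lo, hi = update.min(), update.max()
    scale = (hi - lo) / (2 ** num_bits - 1) or 1.0
    q = np.round((update - lo) / scale).astype(np.uint8 if num_bits <= 8 else np.uint16)
    return q, lo, scale

def dequantize(q, lo, scale):
    return q.astype(np.float32) * scale + lo

def client_update(weights, trainable_mask, data, lr=0.1):
    # One local step on a toy quadratic loss; frozen entries get no update.
    grad = 2.0 * (weights - data)
    return -lr * grad * trainable_mask

rng = np.random.default_rng(0)
global_weights = rng.normal(size=1000).astype(np.float32)
trainable_mask = (np.arange(1000) < 250).astype(np.float32)  # train 25% of params

# One communication round with a handful of clients.
deltas = []
for _ in range(4):
    client_data = rng.normal(size=1000).astype(np.float32)
    delta = client_update(global_weights, trainable_mask, client_data)
    q, lo, scale = quantize(delta[trainable_mask > 0])  # upload only the trained slice
    deltas.append(dequantize(q, lo, scale))

# Server averages the dequantized partial updates and applies them.
avg_delta = np.zeros_like(global_weights)
avg_delta[trainable_mask > 0] = np.mean(deltas, axis=0)
global_weights += avg_delta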
Anthology ID: 2022.fl4nlp-1.2
Volume: Proceedings of the First Workshop on Federated Learning for Natural Language Processing (FL4NLP 2022)
Month: May
Year: 2022
Address: Dublin, Ireland
Editors: Bill Yuchen Lin, Chaoyang He, Chulin Xie, Fatemehsadat Mireshghallah, Ninareh Mehrabi, Tian Li, Mahdi Soltanolkotabi, Xiang Ren
Venue: FL4NLP
Publisher: Association for Computational Linguistics
Pages: 6–20
URL: https://aclanthology.org/2022.fl4nlp-1.2
DOI: 10.18653/v1/2022.fl4nlp-1.2
Cite (ACL): Jae Ro, Theresa Breiner, Lara McConnaughey, Mingqing Chen, Ananda Suresh, Shankar Kumar, and Rajiv Mathews. 2022. Scaling Language Model Size in Cross-Device Federated Learning. In Proceedings of the First Workshop on Federated Learning for Natural Language Processing (FL4NLP 2022), pages 6–20, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal): Scaling Language Model Size in Cross-Device Federated Learning (Ro et al., FL4NLP 2022)
PDF: https://aclanthology.org/2022.fl4nlp-1.2.pdf