%0 Conference Proceedings
%T GPT-NeoX-20B: An Open-Source Autoregressive Language Model
%A Black, Sidney
%A Biderman, Stella
%A Hallahan, Eric
%A Anthony, Quentin
%A Gao, Leo
%A Golding, Laurence
%A He, Horace
%A Leahy, Connor
%A McDonell, Kyle
%A Phang, Jason
%A Pieler, Michael
%A Prashanth, USVSN Sai
%A Purohit, Shivanshu
%A Reynolds, Laria
%A Tow, Jonathan
%A Wang, Ben
%A Weinbach, Samuel
%Y Fan, Angela
%Y Ilic, Suzana
%Y Wolf, Thomas
%Y Gallé, Matthias
%S Proceedings of BigScience Episode #5 – Workshop on Challenges & Perspectives in Creating Large Language Models
%D 2022
%8 May
%I Association for Computational Linguistics
%C virtual+Dublin
%F black-etal-2022-gpt
%X We introduce GPT-NeoX-20B, a 20 billion parameter autoregressive language model trained on the Pile, whose weights will be made freely and openly available to the public through a permissive license. It is, to the best of our knowledge, the largest dense autoregressive model that has publicly available weights at the time of submission. In this work, we describe GPT-NeoX-20B’s architecture and training, and evaluate its performance. We open-source the training and evaluation code, as well as the model weights, at https://github.com/EleutherAI/gpt-neox.
%R 10.18653/v1/2022.bigscience-1.9
%U https://aclanthology.org/2022.bigscience-1.9
%U https://doi.org/10.18653/v1/2022.bigscience-1.9
%P 95-136