trlX: A Framework for Large Scale Reinforcement Learning from Human Feedback

Authors: Alexander Havrilla, Maksym Zhuravinskyi, Duy Phung, Aman Tiwari, Jonathan Tow, Stella Biderman, Quentin Anthony, Louis Castricato
Published in: Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
Editors: Houda Bouamor, Juan Pino, Kalika Bali
Publisher: Association for Computational Linguistics
Location: Singapore
Date: 2023-12
Pages: 8578-8595
Resource type: text
Genre: conference publication
Identifier: havrilla-etal-2023-trlx
DOI: 10.18653/v1/2023.emnlp-main.530
URL: https://aclanthology.org/2023.emnlp-main.530/