Title: Mitigating the Alignment Tax of RLHF
Authors: Yong Lin, Hangyu Lin, Wei Xiong, Shizhe Diao, Jianmeng Liu, Jipeng Zhang, Rui Pan, Haoxiang Wang, Wenbin Hu, Hanning Zhang, Hanze Dong, Renjie Pi, Han Zhao, Nan Jiang, Heng Ji, Yuan Yao, Tong Zhang
Date: 2024-11
Type: text (conference publication)
Proceedings: Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Editors: Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Publisher: Association for Computational Linguistics
Location: Miami, Florida, USA
Pages: 580-606
Identifier: lin-etal-2024-mitigating
DOI: 10.18653/v1/2024.emnlp-main.35
URL: https://aclanthology.org/2024.emnlp-main.35/