Title: DialCoT Meets PPO: Decomposing and Exploring Reasoning Paths in Smaller Language Models
Authors: Chengcheng Han, Xiaowei Du, Che Zhang, Yixin Lian, Xiang Li, Ming Gao, Baoyuan Wang
Date: 2023-12
Type: conference publication
Published in: Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
Editors: Houda Bouamor, Juan Pino, Kalika Bali
Publisher: Association for Computational Linguistics
Place: Singapore
Pages: 8055-8068
Anthology ID: han-etal-2023-dialcot
DOI: 10.18653/v1/2023.emnlp-main.501
URL: https://aclanthology.org/2023.emnlp-main.501/