Zhaorun Chen


2024

pdf bib
DGLF: A Dual Graph-based Learning Framework for Multi-modal Sarcasm Detection
Zhihong Zhu | Kefan Shen | Zhaorun Chen | Yunyan Zhang | Yuyan Chen | Xiaoqi Jiao | Zhongwei Wan | Shaorong Xie | Wei Liu | Xian Wu | Yefeng Zheng
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

pdf bib
AutoPRM: Automating Procedural Supervision for Multi-Step Reasoning via Controllable Question Decomposition
Zhaorun Chen | Zhuokai Zhao | Zhihong Zhu | Ruiqi Zhang | Xiang Li | Bhiksha Raj | Huaxiu Yao
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)

Recent advancements in large language models (LLMs) have shown promise in multi-step reasoning tasks, yet their reliance on extensive manual labeling to provide procedural feedback remains a significant impediment. To address this challenge, in this paper, we propose a novel self-supervised framework **AutoPRM** that efficiently enhances the fine-tuning of LLMs for intricate reasoning challenges. Specifically, **AutoPRM** first decomposes complex problems into more manageable subquestions with a controllable granularity switch, then sequentially apply reinforcement learning to iteratively improve the subquestion solver. Additionally, we propose context-guided decoding to avoid reward tampering and guide the subquestion solver towards the solution of the holistic problem. Extensive experiments show that **AutoPRM** significantly improves performance on mathematical and commonsense reasoning tasks over SOTA. More encouragingly, **AutoPRM** can be easily integrated with other orthogonal reasoning pipelines.