Zhaorun Chen
2024
DGLF: A Dual Graph-based Learning Framework for Multi-modal Sarcasm Detection
Zhihong Zhu | Kefan Shen | Zhaorun Chen | Yunyan Zhang | Yuyan Chen | Xiaoqi Jiao | Zhongwei Wan | Shaorong Xie | Wei Liu | Xian Wu | Yefeng Zheng
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
AutoPRM: Automating Procedural Supervision for Multi-Step Reasoning via Controllable Question Decomposition
Zhaorun Chen | Zhuokai Zhao | Zhihong Zhu | Ruiqi Zhang | Xiang Li | Bhiksha Raj | Huaxiu Yao
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
Recent advancements in large language models (LLMs) have shown promise in multi-step reasoning tasks, yet their reliance on extensive manual labeling to provide procedural feedback remains a significant impediment. To address this challenge, we propose a novel self-supervised framework, **AutoPRM**, that efficiently enhances the fine-tuning of LLMs for intricate reasoning challenges. Specifically, **AutoPRM** first decomposes complex problems into more manageable subquestions with a controllable granularity switch, then sequentially applies reinforcement learning to iteratively improve the subquestion solver. Additionally, we propose context-guided decoding to avoid reward tampering and guide the subquestion solver towards the solution of the holistic problem. Extensive experiments show that **AutoPRM** significantly improves performance on mathematical and commonsense reasoning tasks over SOTA. More encouragingly, **AutoPRM** can be easily integrated with other orthogonal reasoning pipelines.