Tao He


2024

pdf bib
Navigate through Enigmatic Labyrinth A Survey of Chain of Thought Reasoning: Advances, Frontiers and Future
Zheng Chu | Jingchang Chen | Qianglong Chen | Weijiang Yu | Tao He | Haotian Wang | Weihua Peng | Ming Liu | Bing Qin | Ting Liu
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Reasoning, a fundamental cognitive process integral to human intelligence, has garnered substantial interest within artificial intelligence.Notably, recent studies have revealed that chain-of-thought prompting significantly enhances LLM’s reasoning capabilities, which attracts widespread attention from both academics and industry.In this paper, we systematically investigate relevant research, summarizing advanced methods through a meticulous taxonomy that offers novel perspectives.Moreover, we delve into the current frontiers and delineate the challenges and future directions, thereby shedding light on future research.Furthermore, we engage in a discussion about open questions.We hope this paper serves as an introduction for beginners and fosters future research.Resources have been made publicly available at https://github.com/zchuz/CoT-Reasoning-Survey

pdf bib
Planning Like Human: A Dual-process Framework for Dialogue Planning
Tao He | Lizi Liao | Yixin Cao | Yuanxing Liu | Ming Liu | Zerui Chen | Bing Qin
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

In proactive dialogue, the challenge lies not just in generating responses but in steering conversations toward predetermined goals, a task where Large Language Models (LLMs) typically struggle due to their reactive nature. Traditional approaches to enhance dialogue planning in LLMs, ranging from elaborate prompt engineering to the integration of policy networks, either face efficiency issues or deliver suboptimal performance. Inspired by the dual-process theory in psychology, which identifies two distinct modes of thinking—intuitive (fast) and analytical (slow), we propose the Dual-Process Dialogue Planning (DPDP) framework. DPDP embodies this theory through two complementary planning systems: an instinctive policy model for familiar contexts and a deliberative Monte Carlo Tree Search (MCTS) mechanism for complex, novel scenarios. This dual strategy is further coupled with a novel two-stage training regimen: offline Reinforcement Learning for robust initial policy model formation followed by MCTS-enhanced on-the-fly learning, which ensures a dynamic balance between efficiency and strategic depth. Our empirical evaluations across diverse dialogue tasks affirm DPDP’s superiority in achieving both high-quality dialogues and operational efficiency, outpacing existing methods.