Experience as Source for Anticipation and Planning: Experiential Policy Learning for Target-driven Recommendation Dialogues

Huy Quang Dao; Yang Deng; Khanh-Huyen Bui; Dung D. Le; Lizi Liao

doi:10.18653/v1/2024.findings-emnlp.829

Abstract

Target-driven recommendation dialogues present unique challenges in dialogue management due to the necessity of anticipating user interactions for successful conversations. Current methods face significant limitations: (I) inadequate capabilities for conversation anticipation, (II) computational inefficiencies due to costly simulations, and (III) neglect of valuable past dialogue experiences. To address these limitations, we propose a new framework, Experiential Policy Learning (EPL), for enhancing such dialogues. EPL embodies the principle of Learning From Experience, facilitating anticipation with an experiential scoring function that estimates dialogue state potential using similar past interactions stored in long-term memory. To demonstrate its flexibility, we introduce Tree-structured EPL (T-EPL) as one possible training-free realization with Large Language Models (LLMs) and Monte-Carlo Tree Search (MCTS). T-EPL assesses past dialogue states with LLMs while utilizing MCTS to achieve hierarchical and multi-level reasoning. Extensive experiments on two published datasets demonstrate the superiority and efficacy of T-EPL.
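
The experiential scoring idea in the abstract, estimating the potential of a dialogue state from similar past interactions held in memory, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract says T-EPL assesses past dialogue states with LLMs, whereas the sketch below substitutes plain cosine similarity over vectors. The `Experience` record, its `embedding` and `outcome` fields, the `experiential_score` function, and the `k` parameter are all hypothetical names introduced here for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Experience:
    # Hypothetical memory record: a vector for a past dialogue state and
    # an outcome score (e.g. 1.0 if that dialogue reached its target).
    embedding: list[float]
    outcome: float


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def experiential_score(state: list[float], memory: list[Experience], k: int = 2) -> float:
    """Similarity-weighted mean outcome of the k most similar past states.

    Assumption: cosine similarity stands in for the LLM-based assessment
    described in the abstract. Negative similarities are clipped to zero.
    """
    ranked = sorted(
        ((cosine(state, e.embedding), e.outcome) for e in memory),
        reverse=True,
    )[:k]
    total = sum(max(sim, 0.0) for sim, _ in ranked)
    if total == 0.0:
        return 0.0  # no usable experience: fall back to a neutral score
    return sum(max(sim, 0.0) * out for sim, out in ranked) / total


memory = [
    Experience([1.0, 0.0], 1.0),
    Experience([0.0, 1.0], 0.0),
    Experience([1.0, 1.0], 0.5),
]
score = experiential_score([1.0, 0.0], memory, k=2)
```

In a tree-search planner, a score of this kind could be used to evaluate a leaf node in place of a costly simulated rollout, which is the efficiency argument the abstract makes. How T-EPL actually combines the score with MCTS is described in the paper itself, not in this sketch.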

Anthology ID: 2024.findings-emnlp.829
Volume: Findings of the Association for Computational Linguistics: EMNLP 2024
Month: November
Year: 2024
Address: Miami, Florida, USA
Editors: Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 14179–14198
URL: https://aclanthology.org/2024.findings-emnlp.829/
DOI: 10.18653/v1/2024.findings-emnlp.829
Cite (ACL): Huy Quang Dao, Yang Deng, Khanh-Huyen Bui, Dung D. Le, and Lizi Liao. 2024. Experience as Source for Anticipation and Planning: Experiential Policy Learning for Target-driven Recommendation Dialogues. In Findings of the Association for Computational Linguistics: EMNLP 2024, pages 14179–14198, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal): Experience as Source for Anticipation and Planning: Experiential Policy Learning for Target-driven Recommendation Dialogues (Dao et al., Findings 2024)
PDF: https://aclanthology.org/2024.findings-emnlp.829.pdf
