SAGE: Steering Dialog Generation with Future-Aware State-Action Augmentation

Yizhe Zhang; Navdeep Jaitly

doi:10.18653/v1/2025.nlperspectives-1.11

Abstract

Recent advances in large language models have enabled impressive task-oriented applications, yet building emotionally intelligent chatbots for natural, strategic conversations remains challenging. Current approaches often assume a single “ground truth” for emotional responses, overlooking the subjectivity of human emotion. We present a novel perspectivist approach, SAGE, that models multiple perspectives in dialogue generation using latent variables. At its core is the State-Action Chain (SAC), which augments standard fine-tuning with latent variables capturing diverse emotional states and conversational strategies between turns, in a future-looking manner. During inference, these variables are generated before each response, enabling multi-perspective control while preserving natural interactions. We also introduce a self-improvement pipeline combining dialogue tree search, LLM-based reward modeling, and targeted fine-tuning to optimize conversational trajectories. Experiments show improved LLM-based judgments while maintaining strong general LLM performance. The discrete latent variables further enable search-based strategies and open avenues for reinforcement learning in dialogue systems, where learning can occur at the state level rather than the token level.

Anthology ID: 2025.nlperspectives-1.11
Volume: Proceedings of the 4th Workshop on Perspectivist Approaches to NLP
Month: November
Year: 2025
Address: Suzhou, China
Editors: Gavin Abercrombie, Valerio Basile, Simona Frenda, Sara Tonelli, Shiran Dudy
Venues: NLPerspectives | WS
Publisher: Association for Computational Linguistics
Pages: 123–132
URL: https://aclanthology.org/2025.nlperspectives-1.11/
DOI: 10.18653/v1/2025.nlperspectives-1.11
Cite (ACL): Yizhe Zhang and Navdeep Jaitly. 2025. SAGE: Steering Dialog Generation with Future-Aware State-Action Augmentation. In Proceedings of the 4th Workshop on Perspectivist Approaches to NLP, pages 123–132, Suzhou, China. Association for Computational Linguistics.
Cite (Informal): SAGE: Steering Dialog Generation with Future-Aware State-Action Augmentation (Zhang & Jaitly, NLPerspectives 2025)
PDF: https://aclanthology.org/2025.nlperspectives-1.11.pdf
