Alan Wagner


2022

pdf bib
A POMDP Dialogue Policy with 3-way Grounding and Adaptive Sensing for Learning through Communication
Maryam Zare | Alan Wagner | Rebecca Passonneau
Findings of the Association for Computational Linguistics: EMNLP 2022

Agents to assist with rescue, surgery, and similar activities could collaborate better with humans if they could learn new strategic behaviors through communication. We introduce a novel POMDP dialogue policy for learning from people. The policy has 3-way grounding of language in the shared physical context, the dialogue context, and persistent knowledge. It can learn distinct but related games, and can continue learning across dialogues for complex games. A novel sensing component supports adaptation to information-sharing differences across people. The single policy performs better than oracle policies customized to specific games and information behavior.

2020

pdf bib
Dialogue Policies for Learning Board Games through Multimodal Communication
Maryam Zare | Ali Ayub | Aishan Liu | Sweekar Sudhakara | Alan Wagner | Rebecca Passonneau
Proceedings of the 21th Annual Meeting of the Special Interest Group on Discourse and Dialogue

This paper presents MDP policy learning for agents to learn strategic behavior–how to play board games–during multimodal dialogues. Policies are trained offline in simulation, with dialogues carried out in a formal language. The agent has a temporary belief state for the dialogue, and a persistent knowledge store represented as an extensive-form game tree. How well the agent learns a new game from a dialogue with a simulated partner is evaluated by how well it plays the game, given its dialogue-final knowledge state. During policy training, we control for the simulated dialogue partner’s level of informativeness in responding to questions. The agent learns best when its trained policy matches the current dialogue partner’s informativeness. We also present a novel data collection for training natural language modules. Human subjects who engaged in dialogues with a baseline system rated the system’s language skills as above average. Further, results confirm that human dialogue partners also vary in their informativeness.