A POMDP Dialogue Policy with 3-way Grounding and Adaptive Sensing for Learning through Communication

Maryam Zare, Alan Wagner, Rebecca Passonneau


Abstract
Agents to assist with rescue, surgery, and similar activities could collaborate better with humans if they could learn new strategic behaviors through communication. We introduce a novel POMDP dialogue policy for learning from people. The policy has 3-way grounding of language in the shared physical context, the dialogue context, and persistent knowledge. It can learn distinct but related games, and can continue learning across dialogues for complex games. A novel sensing component supports adaptation to information-sharing differences across people. The single policy performs better than oracle policies customized to specific games and information behavior.
Anthology ID:
2022.findings-emnlp.504
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2022
Month:
December
Year:
2022
Address:
Abu Dhabi, United Arab Emirates
Editors:
Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
6767–6780
URL:
https://aclanthology.org/2022.findings-emnlp.504
DOI:
10.18653/v1/2022.findings-emnlp.504
Cite (ACL):
Maryam Zare, Alan Wagner, and Rebecca Passonneau. 2022. A POMDP Dialogue Policy with 3-way Grounding and Adaptive Sensing for Learning through Communication. In Findings of the Association for Computational Linguistics: EMNLP 2022, pages 6767–6780, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
Cite (Informal):
A POMDP Dialogue Policy with 3-way Grounding and Adaptive Sensing for Learning through Communication (Zare et al., Findings 2022)
PDF:
https://aclanthology.org/2022.findings-emnlp.504.pdf
Video:
https://aclanthology.org/2022.findings-emnlp.504.mp4