Dialog policy optimization for low resource setting using Self-play and Reward based Sampling

Tharindu Madusanka, Durashi Langappuli, Thisara Welmilla, Uthayasanker Thayasivam, Sanath Jayasena


Anthology ID:
2020.paclic-1.21
Volume:
Proceedings of the 34th Pacific Asia Conference on Language, Information and Computation
Month:
October
Year:
2020
Address:
Hanoi, Vietnam
Editors:
Minh Le Nguyen, Mai Chi Luong, Sanghoun Song
Venue:
PACLIC
Publisher:
Association for Computational Linguistics
Pages:
178–187
URL:
https://aclanthology.org/2020.paclic-1.21
Cite (ACL):
Tharindu Madusanka, Durashi Langappuli, Thisara Welmilla, Uthayasanker Thayasivam, and Sanath Jayasena. 2020. Dialog policy optimization for low resource setting using Self-play and Reward based Sampling. In Proceedings of the 34th Pacific Asia Conference on Language, Information and Computation, pages 178–187, Hanoi, Vietnam. Association for Computational Linguistics.
Cite (Informal):
Dialog policy optimization for low resource setting using Self-play and Reward based Sampling (Madusanka et al., PACLIC 2020)
PDF:
https://aclanthology.org/2020.paclic-1.21.pdf