Towards Exploiting Background Knowledge for Building Conversation Systems

Nikita Moghe, Siddhartha Arora, Suman Banerjee, Mitesh M. Khapra


Abstract
Existing dialog datasets contain a sequence of utterances and responses without any explicit background knowledge associated with them. This has resulted in the development of models which treat conversation as a sequence-to-sequence generation task (i.e., given a sequence of utterances, generate the response sequence). This is not only an overly simplistic view of conversation, but it is also emphatically different from the way humans converse, which relies heavily on their background knowledge about the topic (as opposed to simply relying on the previous sequence of utterances). For example, it is common for humans to (involuntarily) produce utterances which are copied or suitably modified from background articles they have read about the topic. To facilitate the development of such natural conversation models which mimic the human process of conversing, we create a new dataset containing movie chats wherein each response is explicitly generated by copying and/or modifying sentences from unstructured background knowledge such as plots, comments, and reviews about the movie. We establish baseline results on this dataset (90K utterances from 9K conversations) using three different models: (i) pure generation-based models which ignore the background knowledge, (ii) generation-based models which learn to copy information from the background knowledge when required, and (iii) span-prediction-based models which predict the appropriate response span in the background knowledge.
Anthology ID:
D18-1255
Volume:
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
Month:
October-November
Year:
2018
Address:
Brussels, Belgium
Editors:
Ellen Riloff, David Chiang, Julia Hockenmaier, Jun’ichi Tsujii
Venue:
EMNLP
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Pages:
2322–2332
URL:
https://aclanthology.org/D18-1255/
DOI:
10.18653/v1/D18-1255
Cite (ACL):
Nikita Moghe, Siddhartha Arora, Suman Banerjee, and Mitesh M. Khapra. 2018. Towards Exploiting Background Knowledge for Building Conversation Systems. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 2322–2332, Brussels, Belgium. Association for Computational Linguistics.
Cite (Informal):
Towards Exploiting Background Knowledge for Building Conversation Systems (Moghe et al., EMNLP 2018)
PDF:
https://aclanthology.org/D18-1255.pdf
Attachment:
 D18-1255.Attachment.zip
Video:
 https://aclanthology.org/D18-1255.mp4
Code:
nikitacs16/Holl-E
Data:
Holl-E