Automatic summarization aims to extract important information from large amounts of textual data in order to create a shorter version of the original texts while preserving its information. Training traditional extractive summarization models relies heavily on human-engineered labels such as sentence-level annotations of summary-worthiness. However, in many use cases, such human-engineered labels do not exist and manually annotating thousands of documents for the purpose of training models may not be feasible. On the other hand, indirect signals for summarization are often available, such as agent actions for customer service dialogues, headlines for news articles, diagnosis for Electronic Health Records, etc. In this paper, we develop a general framework that generates extractive summarization as a byproduct of supervised learning tasks for indirect signals via the help of attention mechanism. We test our models on customer service dialogues and experimental results demonstrated that our models can reliably select informative sentences and words for automatic summarization.
Automating Template Creation for Ranking-Based Dialogue Models
Jingxiang Chen | Heba Elfardy | Simi Wang | Andrea Kahn | Jared Kramer
Proceedings of the 2nd Workshop on Natural Language Processing for Conversational AI
Dialogue response generation models that use template ranking rather than direct sequence generation allow model developers to limit generated responses to pre-approved messages. However, manually creating templates is time-consuming and requires domain expertise. To alleviate this problem, we explore automating the process of creating dialogue templates by using unsupervised methods to cluster historical utterances and selecting representative utterances from each cluster. Specifically, we propose an end-to-end model called Deep Sentence Encoder Clustering (DSEC) that uses an auto-encoder structure to jointly learn the utterance representation and construct template clusters. We compare this method to a random baseline that randomly assigns templates to clusters as well as a strong baseline that performs the sentence encoding and the utterance clustering sequentially. To evaluate the performance of the proposed method, we perform an automatic evaluation with two annotated customer service datasets to assess clustering effectiveness, and a human-in-the-loop experiment using a live customer service application to measure the acceptance rate of the generated templates. DSEC performs best in the automatic evaluation, beats both the sequential and random baselines on most metrics in the human-in-the-loop experiment, and shows promising results when compared to gold/manually created templates.