Sungjin Lee

Also published as: Sung-Jin Lee


2021

pdf bib
Dialogue Response Generation via Contrastive Latent Representation Learning
Shuyang Dai | Guoyin Wang | Sunghyun Park | Sungjin Lee
Proceedings of the 3rd Workshop on Natural Language Processing for Conversational AI

Large-scale auto-regressive models have achieved great success in dialogue response generation, with the help of Transformer layers. However, these models do not learn a representative latent space of the sentence distribution, making it hard to control the generation. Recent works have tried on learning sentence representations using Transformer-based framework, but do not model the context-response relationship embedded in the dialogue datasets. In this work, we aim to construct a robust sentence representation learning model, that is specifically designed for dialogue response generation, with Transformer-based encoder-decoder structure. An utterance-level contrastive learning is proposed, encoding predictive information in each context representation for its corresponding response. Extensive experiments are conducted to verify the robustness of the proposed representation learning mechanism. By using both reference-based and reference-free evaluation metrics, we provide detailed analysis on the generated sentences, demonstrating the effectiveness of our proposed model.

pdf bib
A Scalable Framework for Learning From Implicit User Feedback to Improve Natural Language Understanding in Large-Scale Conversational AI Systems
Sunghyun Park | Han Li | Ameen Patel | Sidharth Mudgal | Sungjin Lee | Young-Bum Kim | Spyros Matsoukas | Ruhi Sarikaya
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Natural Language Understanding (NLU) is an established component within a conversational AI or digital assistant system, and it is responsible for producing semantic understanding of a user request. We propose a scalable and automatic approach for improving NLU in a large-scale conversational AI system by leveraging implicit user feedback, with an insight that user interaction data and dialog context have rich information embedded from which user satisfaction and intention can be inferred. In particular, we propose a domain-agnostic framework for curating new supervision data for improving NLU from live production traffic. With an extensive set of experiments, we show the results of applying the framework and improving NLU for a large-scale production system across 10 domains.

pdf bib
Learning Slice-Aware Representations with Mixture of Attentions
Cheng Wang | Sungjin Lee | Sunghyun Park | Han Li | Young-Bum Kim | Ruhi Sarikaya
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

pdf bib
Self-Supervised Contrastive Learning for Efficient User Satisfaction Prediction in Conversational Agents
Mohammad Kachuee | Hao Yuan | Young-Bum Kim | Sungjin Lee
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Turn-level user satisfaction is one of the most important performance metrics for conversational agents. It can be used to monitor the agent’s performance and provide insights about defective user experiences. While end-to-end deep learning has shown promising results, having access to a large number of reliable annotated samples required by these methods remains challenging. In a large-scale conversational system, there is a growing number of newly developed skills, making the traditional data collection, annotation, and modeling process impractical due to the required annotation costs and the turnaround times. In this paper, we suggest a self-supervised contrastive learning approach that leverages the pool of unlabeled data to learn user-agent interactions. We show that the pre-trained models using the self-supervised objective are transferable to the user satisfaction prediction. In addition, we propose a novel few-shot transfer learning approach that ensures better transferability for very small sample sizes. The suggested few-shot method does not require any inner loop optimization process and is scalable to very large datasets and complex models. Based on our experiments using real data from a large-scale commercial system, the suggested approach is able to significantly reduce the required number of annotations, while improving the generalization on unseen skills.

pdf bib
AugNLG: Few-shot Natural Language Generation using Self-trained Data Augmentation
Xinnuo Xu | Guoyin Wang | Young-Bum Kim | Sungjin Lee
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Natural Language Generation (NLG) is a key component in a task-oriented dialogue system, which converts the structured meaning representation (MR) to the natural language. For large-scale conversational systems, where it is common to have over hundreds of intents and thousands of slots, neither template-based approaches nor model-based approaches are scalable. Recently, neural NLGs started leveraging transfer learning and showed promising results in few-shot settings. This paper proposes AugNLG, a novel data augmentation approach that combines a self-trained neural retrieval model with a few-shot learned NLU model, to automatically create MR-to-Text data from open-domain texts. The proposed system mostly outperforms the state-of-the-art methods on the FewshotWOZ data in both BLEU and Slot Error Rate. We further confirm improved results on the FewshotSGD data and provide comprehensive analysis results on key components of our system. Our code and data are available at https://github.com/XinnuoXu/AugNLG.

2020

pdf bib
Guided Dialogue Policy Learning without Adversarial Learning in the Loop
Ziming Li | Sungjin Lee | Baolin Peng | Jinchao Li | Julia Kiseleva | Maarten de Rijke | Shahin Shayandeh | Jianfeng Gao
Findings of the Association for Computational Linguistics: EMNLP 2020

Reinforcement learning methods have emerged as a popular choice for training an efficient and effective dialogue policy. However, these methods suffer from sparse and unstable reward signals returned by a user simulator only when a dialogue finishes. Besides, the reward signal is manually designed by human experts, which requires domain knowledge. Recently, a number of adversarial learning methods have been proposed to learn the reward function together with the dialogue policy. However, to alternatively update the dialogue policy and the reward model on the fly, we are limited to policy-gradient-based algorithms, such as REINFORCE and PPO. Moreover, the alternating training of a dialogue agent and the reward model can easily get stuck in local optima or result in mode collapse. To overcome the listed issues, we propose to decompose the adversarial training into two steps. First, we train the discriminator with an auxiliary dialogue generator and then incorporate a derived reward model into a common reinforcement learning method to guide the dialogue policy learning. This approach is applicable to both on-policy and off-policy reinforcement learning methods. Based on our extensive experimentation, we can conclude the proposed method: (1) achieves a remarkable task success rate using both on-policy and off-policy reinforcement learning methods; and (2) has potential to transfer knowledge from existing domains to a new domain.

2019

pdf bib
ConvLab: Multi-Domain End-to-End Dialog System Platform
Sungjin Lee | Qi Zhu | Ryuichi Takanobu | Zheng Zhang | Yaoqin Zhang | Xiang Li | Jinchao Li | Baolin Peng | Xiujun Li | Minlie Huang | Jianfeng Gao
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: System Demonstrations

We present ConvLab, an open-source multi-domain end-to-end dialog system platform, that enables researchers to quickly set up experiments with reusable components and compare a large set of different approaches, ranging from conventional pipeline systems to end-to-end neural models, in common environments. ConvLab offers a set of fully annotated datasets and associated pre-trained reference models. As a showcase, we extend the MultiWOZ dataset with user dialog act annotations to train all component models and demonstrate how ConvLab makes it easy and effortless to conduct complicated experiments in multi-domain end-to-end dialog settings.

pdf bib
Data-Efficient Goal-Oriented Conversation with Dialogue Knowledge Transfer Networks
Igor Shalyminov | Sungjin Lee | Arash Eshghi | Oliver Lemon
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Goal-oriented dialogue systems are now being widely adopted in industry where it is of key importance to maintain a rapid prototyping cycle for new products and domains. Data-driven dialogue system development has to be adapted to meet this requirement — therefore, reducing the amount of data and annotations necessary for training such systems is a central research problem. In this paper, we present the Dialogue Knowledge Transfer Network (DiKTNet), a state-of-the-art approach to goal-oriented dialogue generation which only uses a few example dialogues (i.e. few-shot learning), none of which has to be annotated. We achieve this by performing a 2-stage training. Firstly, we perform unsupervised dialogue representation pre-training on a large source of goal-oriented dialogues in multiple domains, the MetaLWOz corpus. Secondly, at the transfer stage, we train DiKTNet using this representation together with 2 other textual knowledge sources with different levels of generality: ELMo encoder and the main dataset’s source domains. Our main dataset is the Stanford Multi-Domain dialogue corpus. We evaluate our model on it in terms of BLEU and Entity F1 scores, and show that our approach significantly and consistently improves upon a series of baseline models as well as over the previous state-of-the-art dialogue generation model, ZSDG. The improvement upon the latter — up to 10% in Entity F1 and the average of 3% in BLEU score — is achieved using only 10% equivalent of ZSDG’s in-domain training data.

pdf bib
Structuring Latent Spaces for Stylized Response Generation
Xiang Gao | Yizhe Zhang | Sungjin Lee | Michel Galley | Chris Brockett | Jianfeng Gao | Bill Dolan
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Generating responses in a targeted style is a useful yet challenging task, especially in the absence of parallel data. With limited data, existing methods tend to generate responses that are either less stylized or less context-relevant. We propose StyleFusion, which bridges conversation modeling and non-parallel style transfer by sharing a structured latent space. This structure allows the system to generate stylized relevant responses by sampling in the neighborhood of the conversation model prediction, and continuously control the style level. We demonstrate this method using dialogues from Reddit data and two sets of sentences with distinct styles (arXiv and Sherlock Holmes novels). Automatic and human evaluation show that, without sacrificing appropriateness, the system generates responses of the targeted style and outperforms competitive baselines.

pdf bib
Generating a Common Question from Multiple Documents using Multi-source Encoder-Decoder Models
Woon Sang Cho | Yizhe Zhang | Sudha Rao | Chris Brockett | Sungjin Lee
Proceedings of the 3rd Workshop on Neural Generation and Translation

Ambiguous user queries in search engines result in the retrieval of documents that often span multiple topics. One potential solution is for the search engine to generate multiple refined queries, each of which relates to a subset of the documents spanning the same topic. A preliminary step towards this goal is to generate a question that captures common concepts of multiple documents. We propose a new task of generating common question from multiple documents and present simple variant of an existing multi-source encoder-decoder framework, called the Multi-Source Question Generator (MSQG). We first train an RNN-based single encoder-decoder generator from (single document, question) pairs. At test time, given multiple documents, the Distribute step of our MSQG model predicts target word distributions for each document using the trained model. The Aggregate step aggregates these distributions to generate a common question. This simple yet effective strategy significantly outperforms several existing baseline models applied to the new task when evaluated using automated metrics and human judgments on the MS-MARCO-QA dataset.

pdf bib
Few-Shot Dialogue Generation Without Annotated Data: A Transfer Learning Approach
Igor Shalyminov | Sungjin Lee | Arash Eshghi | Oliver Lemon
Proceedings of the 20th Annual SIGdial Meeting on Discourse and Dialogue

Learning with minimal data is one of the key challenges in the development of practical, production-ready goal-oriented dialogue systems. In a real-world enterprise setting where dialogue systems are developed rapidly and are expected to work robustly for an ever-growing variety of domains, products, and scenarios, efficient learning from a limited number of examples becomes indispensable. In this paper, we introduce a technique to achieve state-of-the-art dialogue generation performance in a few-shot setup, without using any annotated data. We do this by leveraging background knowledge from a larger, more highly represented dialogue source — namely, the MetaLWOz dataset. We evaluate our model on the Stanford Multi-Domain Dialogue Dataset, consisting of human-human goal-oriented dialogues in in-car navigation, appointment scheduling, and weather information domains. We show that our few-shot approach achieves state-of-the art results on that dataset by consistently outperforming the previous best model in terms of BLEU and Entity F1 scores, while being more data-efficient than it by not requiring any data annotation.

pdf bib
Unsupervised Dialogue Spectrum Generation for Log Dialogue Ranking
Xinnuo Xu | Yizhe Zhang | Lars Liden | Sungjin Lee
Proceedings of the 20th Annual SIGdial Meeting on Discourse and Dialogue

Although the data-driven approaches of some recent bot building platforms make it possible for a wide range of users to easily create dialogue systems, those platforms don’t offer tools for quickly identifying which log dialogues contain problems. This is important since corrections to log dialogues provide a means to improve performance after deployment. A log dialogue ranker, which ranks problematic dialogues higher, is an essential tool due to the sheer volume of log dialogues that could be generated. However, training a ranker typically requires labelling a substantial amount of data, which is not feasible for most users. In this paper, we present a novel unsupervised approach for dialogue ranking using GANs and release a corpus of labelled dialogues for evaluation and comparison with supervised methods. The evaluation result shows that our method compares favorably to supervised methods without any labelled data.

pdf bib
Jointly Optimizing Diversity and Relevance in Neural Response Generation
Xiang Gao | Sungjin Lee | Yizhe Zhang | Chris Brockett | Michel Galley | Jianfeng Gao | Bill Dolan
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

Although recent neural conversation models have shown great potential, they often generate bland and generic responses. While various approaches have been explored to diversify the output of the conversation model, the improvement often comes at the cost of decreased relevance. In this paper, we propose a SpaceFusion model to jointly optimize diversity and relevance that essentially fuses the latent space of a sequence-to-sequence model and that of an autoencoder model by leveraging novel regularization terms. As a result, our approach induces a latent space in which the distance and direction from the predicted response vector roughly match the relevance and diversity, respectively. This property also lends itself well to an intuitive visualization of the latent space. Both automatic and human evaluation results demonstrate that the proposed approach brings significant improvement compared to strong baselines in both diversity and relevance.

2017

pdf bib
Composite Task-Completion Dialogue Policy Learning via Hierarchical Deep Reinforcement Learning
Baolin Peng | Xiujun Li | Lihong Li | Jianfeng Gao | Asli Celikyilmaz | Sungjin Lee | Kam-Fai Wong
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

Building a dialogue agent to fulfill complex tasks, such as travel planning, is challenging because the agent has to learn to collectively complete multiple subtasks. For example, the agent needs to reserve a hotel and book a flight so that there leaves enough time for commute between arrival and hotel check-in. This paper addresses this challenge by formulating the task in the mathematical framework of options over Markov Decision Processes (MDPs), and proposing a hierarchical deep reinforcement learning approach to learning a dialogue manager that operates at different temporal scales. The dialogue manager consists of: (1) a top-level dialogue policy that selects among subtasks or options, (2) a low-level dialogue policy that selects primitive actions to complete the subtask given by the top-level policy, and (3) a global state tracker that helps ensure all cross-subtask constraints be satisfied. Experiments on a travel planning task with simulated and real users show that our approach leads to significant improvements over three baselines, two based on handcrafted rules and the other based on flat deep reinforcement learning.

2016

pdf bib
Task Lineages: Dialog State Tracking for Flexible Interaction
Sungjin Lee | Amanda Stent
Proceedings of the 17th Annual Meeting of the Special Interest Group on Discourse and Dialogue

2015

pdf bib
Joint Embedding of Query and Ad by Leveraging Implicit Feedback
Sungjin Lee | Yifan Hu
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

pdf bib
Online Sentence Novelty Scoring for Topical Document Streams
Sungjin Lee
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

pdf bib
The Real Challenge 2014: Progress and Prospects
Maxine Eskenazi | Alan W Black | Sungjin Lee | David Traum
Proceedings of the 16th Annual Meeting of the Special Interest Group on Discourse and Dialogue

pdf bib
The Cohort and Speechify Libraries for Rapid Construction of Speech Enabled Applications for Android
Tejaswi Kasturi | Haojian Jin | Aasish Pappu | Sungjin Lee | Beverley Harrison | Ramana Murthy | Amanda Stent
Proceedings of the 16th Annual Meeting of the Special Interest Group on Discourse and Dialogue

2014

pdf bib
Extrinsic Evaluation of Dialog State Tracking and Predictive Metrics for Dialog Policy Optimization
Sungjin Lee
Proceedings of the 15th Annual Meeting of the Special Interest Group on Discourse and Dialogue (SIGDIAL)

2013

pdf bib
Recipe For Building Robust Spoken Dialog State Trackers: Dialog State Tracking Challenge System Description
Sungjin Lee | Maxine Eskenazi
Proceedings of the SIGDIAL 2013 Conference

pdf bib
Structured Discriminative Model For Dialog State Tracking
Sungjin Lee
Proceedings of the SIGDIAL 2013 Conference

2012

pdf bib
An Unsupervised Approach to User Simulation: Toward Self-Improving Dialog Systems
Sungjin Lee | Maxine Eskenazi
Proceedings of the 13th Annual Meeting of the Special Interest Group on Discourse and Dialogue

pdf bib
Exploiting Machine-Transcribed Dialog Corpus to Improve Multiple Dialog States Tracking Methods
Sungjin Lee | Maxine Eskenazi
Proceedings of the 13th Annual Meeting of the Special Interest Group on Discourse and Dialogue

2011

pdf bib
POMY: A Conversational Virtual Environment for Language Learning in POSTECH
Hyungjong Noh | Kyusong Lee | Sungjin Lee | Gary Geunbae Lee
Proceedings of the SIGDIAL 2011 Conference

2009

pdf bib
Realistic Grammar Error Simulation using Markov Logic
Sungjin Lee | Gary Geunbae Lee
Proceedings of the ACL-IJCNLP 2009 Conference Short Papers

1994

pdf bib
A Two-Level Morphological Analysis of Korean
Deok-Bong Kim | Sung-Jin Lee | Key-Sun Choi | Gil-Chang Kim
COLING 1994 Volume 1: The 15th International Conference on Computational Linguistics