Sanchit Agarwal


pdf bib
Few Shot Dialogue State Tracking using Meta-learning
Saket Dingliwal | Shuyang Gao | Sanchit Agarwal | Chien-Wei Lin | Tagyoung Chung | Dilek Hakkani-Tur
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume

Dialogue State Tracking (DST) forms a core component of automated chatbot based systems designed for specific goals like hotel, taxi reservation, tourist information etc. With the increasing need to deploy such systems in new domains, solving the problem of zero/few-shot DST has become necessary. There has been a rising trend for learning to transfer knowledge from resource-rich domains to unknown domains with minimal need for additional data. In this work, we explore the merits of meta-learning algorithms for this transfer and hence, propose a meta-learner D-REPTILE specific to the DST problem. With extensive experimentation, we provide clear evidence of benefits over conventional approaches across different domains, methods, base models and datasets with significant (5-25%) improvement over the baseline in a low-data setting. Our proposed meta-learner is agnostic of the underlying model and hence any existing state-of-the-art DST system can improve its performance on unknown domains using our training strategy.

pdf bib
Alexa Conversations: An Extensible Data-driven Approach for Building Task-oriented Dialogue Systems
Anish Acharya | Suranjit Adhikari | Sanchit Agarwal | Vincent Auvray | Nehal Belgamwar | Arijit Biswas | Shubhra Chandra | Tagyoung Chung | Maryam Fazel-Zarandi | Raefer Gabriel | Shuyang Gao | Rahul Goel | Dilek Hakkani-Tur | Jan Jezabek | Abhay Jha | Jiun-Yu Kao | Prakash Krishnan | Peter Ku | Anuj Goyal | Chien-Wei Lin | Qing Liu | Arindam Mandal | Angeliki Metallinou | Vishal Naik | Yi Pan | Shachi Paul | Vittorio Perera | Abhishek Sethi | Minmin Shen | Nikko Strom | Eddie Wang
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Demonstrations

Traditional goal-oriented dialogue systems rely on various components such as natural language understanding, dialogue state tracking, policy learning and response generation. Training each component requires annotations which are hard to obtain for every new domain, limiting scalability of such systems. Similarly, rule-based dialogue systems require extensive writing and maintenance of rules and do not scale either. End-to-End dialogue systems, on the other hand, do not require module-specific annotations but need a large amount of data for training. To overcome these problems, in this demo, we present Alexa Conversations, a new approach for building goal-oriented dialogue systems that is scalable, extensible as well as data efficient. The components of this system are trained in a data-driven manner, but instead of collecting annotated conversations for training, we generate them using a novel dialogue simulator based on a few seed dialogues and specifications of APIs and entities provided by the developer. Our approach provides out-of-the-box support for natural conversational phenomenon like entity sharing across turns or users changing their mind during conversation without requiring developers to provide any such dialogue flows. We exemplify our approach using a simple pizza ordering task and showcase its value in reducing the developer burden for creating a robust experience. Finally, we evaluate our system using a typical movie ticket booking task integrated with live APIs and show that the dialogue simulator is an essential component of the system that leads to over 50% improvement in turn-level action signature prediction accuracy.


pdf bib
MultiWOZ 2.1: A Consolidated Multi-Domain Dialogue Dataset with State Corrections and State Tracking Baselines
Mihail Eric | Rahul Goel | Shachi Paul | Abhishek Sethi | Sanchit Agarwal | Shuyang Gao | Adarsh Kumar | Anuj Goyal | Peter Ku | Dilek Hakkani-Tur
Proceedings of the 12th Language Resources and Evaluation Conference

MultiWOZ 2.0 (Budzianowski et al., 2018) is a recently released multi-domain dialogue dataset spanning 7 distinct domains and containing over 10,000 dialogues. Though immensely useful and one of the largest resources of its kind to-date, MultiWOZ 2.0 has a few shortcomings. Firstly, there are substantial noise in the dialogue state annotations and dialogue utterances which negatively impact the performance of state-tracking models. Secondly, follow-up work (Lee et al., 2019) has augmented the original dataset with user dialogue acts. This leads to multiple co-existent versions of the same dataset with minor modifications. In this work we tackle the aforementioned issues by introducing MultiWOZ 2.1. To fix the noisy state annotations, we use crowdsourced workers to re-annotate state and utterances based on the original utterances in the dataset. This correction process results in changes to over 32% of state annotations across 40% of the dialogue turns. In addition, we fix 146 dialogue utterances by canonicalizing slot values in the utterances to the values in the dataset ontology. To address the second problem, we combined the contributions of the follow-up works into MultiWOZ 2.1. Hence, our dataset also includes user dialogue acts as well as multiple slot descriptions per dialogue state slot. We then benchmark a number of state-of-the-art dialogue state tracking models on the MultiWOZ 2.1 dataset and show the joint state tracking performance on the corrected state annotations. We are publicly releasing MultiWOZ 2.1 to the community, hoping that this dataset resource will allow for more effective models across various dialogue subproblems to be built in the future.

pdf bib
From Machine Reading Comprehension to Dialogue State Tracking: Bridging the Gap
Shuyang Gao | Sanchit Agarwal | Di Jin | Tagyoung Chung | Dilek Hakkani-Tur
Proceedings of the 2nd Workshop on Natural Language Processing for Conversational AI

Dialogue state tracking (DST) is at the heart of task-oriented dialogue systems. However, the scarcity of labeled data is an obstacle to building accurate and robust state tracking systems that work across a variety of domains. Existing approaches generally require some dialogue data with state information and their ability to generalize to unknown domains is limited. In this paper, we propose using machine reading comprehension (RC) in state tracking from two perspectives: model architectures and datasets. We divide the slot types in dialogue state into categorical or extractive to borrow the advantages from both multiple-choice and span-based reading comprehension models. Our method achieves near the current state-of-the-art in joint goal accuracy on MultiWOZ 2.1 given full training data. More importantly, by leveraging machine reading comprehension datasets, our method outperforms the existing approaches by many a large margin in few-shot scenarios when the availability of in-domain data is limited. Lastly, even without any state tracking data, i.e., zero-shot scenario, our proposed approach achieves greater than 90% average slot accuracy in 12 out of 30 slots in MultiWOZ 2.1.


pdf bib
Dialog State Tracking: A Neural Reading Comprehension Approach
Shuyang Gao | Abhishek Sethi | Sanchit Agarwal | Tagyoung Chung | Dilek Hakkani-Tur
Proceedings of the 20th Annual SIGdial Meeting on Discourse and Dialogue

Dialog state tracking is used to estimate the current belief state of a dialog given all the preceding conversation. Machine reading comprehension, on the other hand, focuses on building systems that read passages of text and answer questions that require some understanding of passages. We formulate dialog state tracking as a reading comprehension task to answer the question what is the state of the current dialog? after reading conversational context. In contrast to traditional state tracking methods where the dialog state is often predicted as a distribution over a closed set of all the possible slot values within an ontology, our method uses a simple attention-based neural network to point to the slot values within the conversation. Experiments on MultiWOZ-2.0 cross-domain dialog dataset show that our simple system can obtain similar accuracies compared to the previous more complex methods. By exploiting recent advances in contextual word embeddings, adding a model that explicitly tracks whether a slot value should be carried over to the next turn, and combining our method with a traditional joint state tracking method that relies on closed set vocabulary, we can obtain a joint-goal accuracy of 47.33% on the standard test split, exceeding current state-of-the-art by 11.75%**.