Improving Dialogue State Tracking with Turn-based Loss Function and Sequential Data Augmentation
Jarana Manotumruksa | Jeff Dalton | Edgar Meij | Emine Yilmaz
Findings of the Association for Computational Linguistics: EMNLP 2021
While state-of-the-art Dialogue State Tracking (DST) models show promising results, all of them rely on a traditional cross-entropy loss function during the training process, which may not be optimal for improving the joint goal accuracy. Although several approaches recently propose augmenting the training set by copying user utterances and replacing the real slot values with other possible or even similar values, they are not effective at improving the performance of existing DST models. To address these challenges, we propose a Turn-based Loss Function (TLF) that penalises the model if it inaccurately predicts a slot value at the early turns more so than in later turns in order to improve joint goal accuracy. We also propose a simple but effective Sequential Data Augmentation (SDA) algorithm to generate more complex user utterances and system responses to effectively train existing DST models. Experimental results on two standard DST benchmark collections demonstrate that our proposed TLF and SDA techniques significantly improve the effectiveness of the state-of-the-art DST model by approximately 7-8% relative reduction in error and achieves a new state-of-the-art joint goal accuracy with 59.50 and 54.90 on MultiWOZ2.1 and MultiWOZ2.2, respectively.
Semantic hashing is a powerful paradigm for representing texts as compact binary hash codes. The explosion of short text data has spurred the demand of few-bits hashing. However, the performance of existing semantic hashing methods cannot be guaranteed when applied to few-bits hashing because of severe information loss. In this paper, we present a simple but effective unsupervised neural generative semantic hashing method with a focus on few-bits hashing. Our model is built upon variational autoencoder and represents each hash bit as a Bernoulli variable, which allows the model to be end-to-end trainable. To address the issue of information loss, we introduce a set of auxiliary implicit topic vectors. With the aid of these topic vectors, the generated hash codes are not only low-dimensional representations of the original texts but also capture their implicit topics. We conduct comprehensive experiments on four datasets. The results demonstrate that our approach achieves significant improvements over state-of-the-art semantic hashing methods in few-bits hashing.