Exploiting domain-slot related keywords description for Few-Shot Cross-Domain Dialogue State Tracking
Gao Qixiang | Guanting Dong | Yutao Mou | Liwen Wang | Chen Zeng | Daichi Guo | Mingyang Sun | Weiran Xu
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
Collecting dialogue data with domain-slot-value labels for dialogue state tracking (DST) can be a costly process. In this paper, we propose a novel framework based on domain-slot related descriptions to tackle the challenge of few-shot cross-domain DST. Specifically, we design an extraction module to extract domain-slot related verbs and nouns from the dialogue. Then, we integrate them into the description, which aims to prompt the model to identify the slot information. Furthermore, we introduce a random sampling strategy to improve the domain generalization ability of the model. We utilize a pre-trained model to encode contexts and descriptions and generate answers in an auto-regressive manner. Experimental results show that our approaches substantially outperform the existing few-shot DST methods on MultiWOZ and gain strong improvements in slot accuracy compared to existing slot description methods.
PSSAT: A Perturbed Semantic Structure Awareness Transferring Method for Perturbation-Robust Slot Filling
Guanting Dong | Daichi Guo | Liwen Wang | Xuefeng Li | Zechen Wang | Chen Zeng | Keqing He | Jinzheng Zhao | Hao Lei | Xinyue Cui | Yi Huang | Junlan Feng | Weiran Xu
Proceedings of the 29th International Conference on Computational Linguistics
Most existing slot filling models tend to memorize inherent patterns of entities and corresponding contexts from training data. However, these models can lead to system failure or undesirable outputs when exposed to spoken language perturbation or variation in practice. We propose a perturbed semantic structure awareness transferring method for training perturbation-robust slot filling models. Specifically, we introduce two MLM-based training strategies to respectively learn contextual semantic structure and word distribution from an unsupervised language perturbation corpus. Then, we transfer semantic knowledge learned from the upstream training procedure into the original samples and filter generated data by consistency processing. These procedures aim to enhance the robustness of slot filling models. Experimental results show that our method consistently outperforms the previous basic methods and gains strong generalization while preventing the model from memorizing inherent patterns of entities and contexts.
- Guanting Dong 2
- Liwen Wang 2
- Daichi Guo 2
- Weiran Xu 2
- Gao Qixiang 1