Ridong Jiang
2023
Speech-Aware Multi-Domain Dialogue State Generation with ASR Error Correction Modules
Ridong Jiang | Wei Shi | Bin Wang | Chen Zhang | Yan Zhang | Chunlei Pan | Jung Jae Kim | Haizhou Li
Proceedings of The Eleventh Dialog System Technology Challenge
Prior research on dialogue state tracking (DST) is mostly based on written dialogue corpora. For spoken dialogues, a DST model trained on written text must take the output (hypotheses) of automatic speech recognition (ASR) as input. However, ASR hypotheses often contain errors, which leads to a significant performance drop in spoken dialogue state tracking. We address this issue by developing the following ASR error correction modules. First, we train a model to convert ASR hypotheses into the ground-truth user utterances, which can fix frequent error patterns. The model takes the hypotheses of two ASR models as input and is fine-tuned in two stages. The corrected hypothesis is fed into a large-scale pre-trained encoder-decoder model (T5) for DST training and inference. Second, if an output slot value from the encoder-decoder model is a name, we compare it with names in a dictionary crawled from websites and, if feasible, replace it with the crawled name at the shortest edit distance. Third, we fix errors in temporal expressions in the ASR hypotheses using hand-crafted rules. Experimental results on the DSTC 11 speech-aware dataset, which is built on the popular MultiWOZ task (version 2.1), show that our proposed method can effectively mitigate the performance drop when moving from written text to spoken conversations.
2020
GCDST: A Graph-based and Copy-augmented Multi-domain Dialogue State Tracking
Peng Wu | Bowei Zou | Ridong Jiang | AiTi Aw
Findings of the Association for Computational Linguistics: EMNLP 2020
As an essential component of task-oriented dialogue systems, Dialogue State Tracking (DST) is responsible for estimating user intentions and requests in dialogue contexts and extracting substantial goals (states) from user utterances, helping downstream modules determine the next actions of the dialogue system. In practice, a major challenge in constructing a robust DST model is processing conversations with multi-domain states. However, most existing approaches train DST models on a single domain independently, ignoring information shared across domains. To tackle the multi-domain DST task, we first construct a dialogue state graph to transfer structured features among related domain-slot pairs across domains. Then, we encode the graph information of dialogue states with graph convolutional networks and use a hard copy mechanism to directly copy historical states from the previous conversation. Experimental results show that our model improves on the multi-domain DST baseline (TRADE) by 2.0% and 1.0% absolute joint accuracy on the MultiWOZ 2.0 and 2.1 dialogue datasets, respectively.
2016
Evaluating and Combining Name Entity Recognition Systems
Ridong Jiang | Rafael E. Banchs | Haizhou Li
Proceedings of the Sixth Named Entity Workshop
2013
AIDA: Artificial Intelligent Dialogue Agent
Rafael E. Banchs | Ridong Jiang | Seokhwan Kim | Arthur Niswar | Kheng Hui Yeo
Proceedings of the SIGDIAL 2013 Conference
Co-authors
- Rafael E. Banchs 2
- Haizhou Li 2
- Aiti Aw 1
- Jung-jae Kim 1
- Seokhwan Kim 1