Tisha Anders
2023
Grounding Description-Driven Dialogue State Trackers with Knowledge-Seeking Turns
Alexandru Coca | Bo-Hsiang Tseng | Jinghong Chen | Weizhe Lin | Weixuan Zhang | Tisha Anders | Bill Byrne
Proceedings of the 24th Annual Meeting of the Special Interest Group on Discourse and Dialogue
Schema-guided dialogue state trackers can generalise to new domains without further training, yet they are sensitive to the writing style of the schemata. Augmenting the training set with human or synthetic schema paraphrases improves the model robustness to these variations but can be either costly or difficult to control. We propose to circumvent these issues by grounding the state tracking model in knowledge-seeking turns collected from the dialogue corpus as well as the schema. Including these turns in prompts during finetuning and inference leads to marked improvements in model robustness, as demonstrated by large average joint goal accuracy and schema sensitivity improvements on SGD and SGD-X.
2022
uFACT: Unfaithful Alien-Corpora Training for Semantically Consistent Data-to-Text Generation
Tisha Anders | Alexandru Coca | Bill Byrne
Findings of the Association for Computational Linguistics: ACL 2022
We propose uFACT (Un-Faithful Alien Corpora Training), a training corpus construction method for data-to-text (d2t) generation models. We show that d2t models trained on uFACT datasets generate utterances which represent the semantic content of the data sources more accurately than models trained on the target corpus alone. Our approach is to augment the training set of a given target corpus with alien corpora which have different semantic representations. We show that while it is important to have faithful data from the target corpus, the faithfulness of additional corpora only plays a minor role. Consequently, uFACT datasets can be constructed with large quantities of unfaithful data. We show how uFACT can be leveraged to obtain state-of-the-art results on the WebNLG benchmark using METEOR as our performance metric. Furthermore, we investigate the sensitivity of the generation faithfulness to the training corpus structure using the PARENT metric, and provide a baseline for this metric on the WebNLG (Gardent et al., 2017) benchmark to facilitate comparisons with future work.
Co-authors
- Bill Byrne 2
- Alexandru Coca 2
- Jinghong Chen 1
- Weizhe Lin 1
- Bo-Hsiang Tseng 1