HybriDialogue: An Information-Seeking Dialogue Dataset Grounded on Tabular and Textual Data

Kai Nakamura, Sharon Levy, Yi-Lin Tuan, Wenhu Chen, William Yang Wang


Abstract
A pressing challenge in current dialogue systems is to successfully converse with users on topics with information distributed across different modalities. Previous work in multi-turn dialogue systems has primarily focused on either text or table information. In more realistic scenarios, having a joint understanding of both is critical as knowledge is typically distributed over both unstructured and structured forms. We present a new dialogue dataset, HybriDialogue, which consists of crowdsourced natural conversations grounded on both Wikipedia text and tables. The conversations are created through the decomposition of complex multi-hop questions into simple, realistic multi-turn dialogue interactions. We propose retrieval, system state tracking, and dialogue response generation tasks for our dataset and conduct baseline experiments for each. Our results show that there is still ample opportunity for improvement, demonstrating the importance of building stronger dialogue systems that can reason over the complex setting of information-seeking dialogue grounded on tables and text.
Anthology ID:
2022.findings-acl.41
Volume:
Findings of the Association for Computational Linguistics: ACL 2022
Month:
May
Year:
2022
Address:
Dublin, Ireland
Editors:
Smaranda Muresan, Preslav Nakov, Aline Villavicencio
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
481–492
URL:
https://aclanthology.org/2022.findings-acl.41
DOI:
10.18653/v1/2022.findings-acl.41
Cite (ACL):
Kai Nakamura, Sharon Levy, Yi-Lin Tuan, Wenhu Chen, and William Yang Wang. 2022. HybriDialogue: An Information-Seeking Dialogue Dataset Grounded on Tabular and Textual Data. In Findings of the Association for Computational Linguistics: ACL 2022, pages 481–492, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
HybriDialogue: An Information-Seeking Dialogue Dataset Grounded on Tabular and Textual Data (Nakamura et al., Findings 2022)
PDF:
https://aclanthology.org/2022.findings-acl.41.pdf
Video:
https://aclanthology.org/2022.findings-acl.41.mp4
Data
CoQA, DoQA, HybridQA, Natural Questions, OTT-QA, RecipeQA, SQA, ShARC