Cathy Jiao


2024

pdf bib
Examining Prosody in Spoken Navigation Instructions for People with Disabilities
Cathy Jiao | Aaron Steinfeld | Maxine Eskenazi
Proceedings of the Third Workshop on Bridging Human--Computer Interaction and Natural Language Processing

The introduction of conversational systems have made synthesized speech technologies common tools for daily activities. However, not all synthetic speech systems are designed with the needs of people with disabilities in mind. This paper describes a study in which 198 people – 80 participants with self-reported disabilities and 118 participants without – were recruited to listen to navigation instructions from a spoken dialogue system with different prosodic features. Results showed that slowing down speech rate aids in participants’ number recall, but not in noun recall. From our results, we provide suggestions for developers for building accessible synthetic speech systems.

2022

pdf bib
The DialPort tools
Jessica Huynh | Shikib Mehri | Cathy Jiao | Maxine Eskenazi
Proceedings of the 23rd Annual Meeting of the Special Interest Group on Discourse and Dialogue

The DialPort project (http://dialport.org/), funded by the National Science Foundation (NSF), covers a group of tools and services that aim at fulfilling the needs of the dialog research community. Over the course of six years, several offerings have been created, including the DialPort Portal and DialCrowd. This paper describes these contributions, which will be demoed at SIGDIAL, including implementation, prior studies, corresponding discoveries, and the locations at which the tools will remain freely available to the community going forward.

pdf bib
InstructDial: Improving Zero and Few-shot Generalization in Dialogue through Instruction Tuning
Prakhar Gupta | Cathy Jiao | Yi-Ting Yeh | Shikib Mehri | Maxine Eskenazi | Jeffrey Bigham
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

Instruction tuning is an emergent paradigm in NLP wherein natural language instructions are leveraged with language models to induce zero-shot performance on unseen tasks. Dialogue is an especially interesting area in which to explore instruction tuning because dialogue systems perform multiple kinds of tasks related to language (e.g., natural language understanding and generation, domain-specific interaction), yet instruction tuning has not been systematically explored for dialogue-related tasks. We introduce InstructDial, an instruction tuning framework for dialogue, which consists of a repository of 48 diverse dialogue tasks in a unified text-to-text format created from 59 openly available dialogue datasets. We explore cross-task generalization ability on models tuned on InstructDial across diverse dialogue tasks. Our analysis reveals that InstructDial enables good zero-shot performance on unseen datasets and tasks such as dialogue evaluation and intent detection, and even better performance in a few-shot setting. To ensure that models adhere to instructions, we introduce novel meta-tasks. We establish benchmark zero-shot and few-shot performance of models trained using the proposed framework on multiple dialogue tasks.

pdf bib
Improving compositional generalization for multi-step quantitative reasoning in question answering
Armineh Nourbakhsh | Cathy Jiao | Sameena Shah | Carolyn Rosé
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

Quantitative reasoning is an important aspect of question answering, especially when numeric and verbal cues interact to indicate sophisticated, multi-step programs. In this paper, we demonstrate how modeling the compositional nature of quantitative text can enhance the performance and robustness of QA models, allowing them to capture arithmetic logic that is expressed verbally. Borrowing from the literature on semantic parsing, we propose a method that encourages the QA models to adjust their attention patterns and capture input/output alignments that are meaningful to the reasoning task. We show how this strategy improves program accuracy and renders the models more robust against overfitting as the number of reasoning steps grows. Our approach is designed as a standalone module which can be prepended to many existing models and trained in an end-to-end fashion without the need for additional supervisory signal. As part of this exercise, we also create a unified dataset building on four previously released numerical QA datasets over tabular data.