Securely Capturing People’s Interactions with Voice Assistants at Home: A Bespoke Tool for Ethical Data Collection

Angus Addlesee


Abstract
Speech production is nuanced and unique to every individual, but today’s Spoken Dialogue Systems (SDSs) are trained to use general speech patterns to successfully improve performance on various evaluation metrics. However, these patterns do not apply to certain user groups - often the very people that can benefit the most from SDSs. For example, people with dementia produce more disfluent speech than the general population. In order to evaluate systems with specific user groups in mind, and to guide the design of such systems to deliver maximum benefit to these users, data must be collected securely. In this short paper we present CVR-SI, a bespoke tool for ethical data collection. Designed for the healthcare domain, we argue that it should also be used in more general settings. We detail how off-the-shelf solutions fail to ensure that sensitive data remains secure and private. We then describe the ethical design and security features of our device, with a full guide on how to build both the hardware and software components of CVR-SI. Our design ensures inclusivity to all researchers in this field, particularly those who are not hardware experts. This guarantees everyone can collect appropriate data for human evaluation ethically, securely, and in a timely manner.
Anthology ID:
2022.nlp4pi-1.3
Volume:
Proceedings of the Second Workshop on NLP for Positive Impact (NLP4PI)
Month:
December
Year:
2022
Address:
Abu Dhabi, United Arab Emirates (Hybrid)
Editors:
Laura Biester, Dorottya Demszky, Zhijing Jin, Mrinmaya Sachan, Joel Tetreault, Steven Wilson, Lu Xiao, Jieyu Zhao
Venue:
NLP4PI
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
25–30
Language:
URL:
https://aclanthology.org/2022.nlp4pi-1.3
DOI:
10.18653/v1/2022.nlp4pi-1.3
Bibkey:
Cite (ACL):
Angus Addlesee. 2022. Securely Capturing People’s Interactions with Voice Assistants at Home: A Bespoke Tool for Ethical Data Collection. In Proceedings of the Second Workshop on NLP for Positive Impact (NLP4PI), pages 25–30, Abu Dhabi, United Arab Emirates (Hybrid). Association for Computational Linguistics.
Cite (Informal):
Securely Capturing People’s Interactions with Voice Assistants at Home: A Bespoke Tool for Ethical Data Collection (Addlesee, NLP4PI 2022)
Copy Citation:
PDF:
https://aclanthology.org/2022.nlp4pi-1.3.pdf