Improving Large-Scale Conversational Assistants using Model Interpretation based Training Sample Selection

Stefan Schroedl, Manoj Kumar, Kiana Hajebi, Morteza Ziyadi, Sriram Venkatapathy, Anil Ramakrishna, Rahul Gupta, Pradeep Natarajan


Abstract
This paper presents an approach to identify samples from live traffic where the customer implicitly communicated satisfaction with Alexa’s responses, by leveraging interpretations of model behavior. Such customer signals are noisy, and adding a large number of samples from live traffic to the training set makes re-training infeasible. Our work addresses these challenges by identifying a small number of samples that grow the training set by ~0.05% while producing statistically significant improvements in both offline and online tests.
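The abstract describes scoring live-traffic samples with model interpretations and adding only a tiny, fixed-budget subset to the training set. The paper page does not include code; the sketch below is only a hypothetical illustration of the budgeted-selection step, assuming some per-sample interpretation score has already been computed (the scoring itself, the function names, and the use of the ~0.05% figure as a hard cap are all assumptions, not the authors' implementation).

```python
import numpy as np

def select_training_samples(scores, train_size, growth=0.0005):
    """Pick the highest-scoring live-traffic candidates, capped so the
    training set grows by at most `growth` (~0.05% in the abstract).

    scores     : 1-D array, one interpretation-based score per candidate
    train_size : size of the existing training set
    returns    : indices of selected candidates, highest score first
    """
    budget = max(1, int(train_size * growth))
    order = np.argsort(scores)        # ascending by score
    return order[-budget:][::-1]      # top `budget`, descending

# Toy usage: 1M-utterance training set, 10k live-traffic candidates
# with stand-in random scores (a real system would use model
# interpretations here).
rng = np.random.default_rng(0)
scores = rng.random(10_000)
chosen = select_training_samples(scores, train_size=1_000_000)
print(len(chosen))  # 500, i.e. 0.05% of the training set
```

The point of the cap is operational: re-training on all noisy live-traffic signals is infeasible, so selection quality matters more than volume.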
Anthology ID: 2022.emnlp-industry.37
Volume: Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: Industry Track
Month: December
Year: 2022
Address: Abu Dhabi, UAE
Editors: Yunyao Li, Angeliki Lazaridou
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 371–378
URL: https://aclanthology.org/2022.emnlp-industry.37
DOI: 10.18653/v1/2022.emnlp-industry.37
Cite (ACL):
Stefan Schroedl, Manoj Kumar, Kiana Hajebi, Morteza Ziyadi, Sriram Venkatapathy, Anil Ramakrishna, Rahul Gupta, and Pradeep Natarajan. 2022. Improving Large-Scale Conversational Assistants using Model Interpretation based Training Sample Selection. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: Industry Track, pages 371–378, Abu Dhabi, UAE. Association for Computational Linguistics.
Cite (Informal):
Improving Large-Scale Conversational Assistants using Model Interpretation based Training Sample Selection (Schroedl et al., EMNLP 2022)
PDF: https://aclanthology.org/2022.emnlp-industry.37.pdf