An Active Learning Pipeline for NLU Error Detection in Conversational Agents

Damian Pascual, Aritz Bercher, Akansha Bhardwaj, Mingbo Cui, Dominic Kohler, Liam Van Der Poel, Paolo Rosso


Abstract
High-quality labeled data is paramount to the performance of modern machine learning models. However, annotating data is a time-consuming and costly process that requires human experts to examine large collections of raw data. For conversational agents in production settings with access to large amounts of user-agent conversations, the challenge is to decide which data should be annotated first. We consider the Natural Language Understanding (NLU) component of a conversational agent deployed in a real-world setup with limited resources. We present an active learning pipeline for offline detection of classification errors that leverages two strong classifiers. Then, we perform topic modeling on the potentially misclassified samples to ease data analysis and to reveal error patterns. In our experiments, we show on a real-world dataset that by using our method to prioritize data annotation we reach 100% of the performance while annotating only 36% of the data. Finally, we present an analysis of some of the error patterns revealed and argue that our pipeline is a valuable tool to detect critical errors and reduce the workload of annotators.
Anthology ID:
2023.law-1.6
Volume:
Proceedings of the 17th Linguistic Annotation Workshop (LAW-XVII)
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Jakob Prange, Annemarie Friedrich
Venue:
LAW
Publisher:
Association for Computational Linguistics
Pages:
55–60
URL:
https://aclanthology.org/2023.law-1.6
DOI:
10.18653/v1/2023.law-1.6
Cite (ACL):
Damian Pascual, Aritz Bercher, Akansha Bhardwaj, Mingbo Cui, Dominic Kohler, Liam Van Der Poel, and Paolo Rosso. 2023. An Active Learning Pipeline for NLU Error Detection in Conversational Agents. In Proceedings of the 17th Linguistic Annotation Workshop (LAW-XVII), pages 55–60, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
An Active Learning Pipeline for NLU Error Detection in Conversational Agents (Pascual et al., LAW 2023)
PDF:
https://aclanthology.org/2023.law-1.6.pdf